ShangChienLiu/FDT5_Model


FDT5_MODEL

β—¦ FDT5_Model: Transforming AI with Simplicity and Power

β—¦ Developed with the following software and tools: tqdm, SciPy, Python, GitHub, pandas, NumPy, JSON, and Markdown.



πŸ“ Overview

FDT5_Model generates engaging questions about a specified location, represented by a vehicle's GPS coordinates and four side street-view images captured by on-car cameras. It uses data from the Google Street View dataset and crafts prompts from the address obtained by reverse geocoding the GPS coordinates, complemented by captions produced for the street-view images by an image captioning model. This repository demonstrates question generation with street views and coordinates from several locations: USA_Pittsburgh, USA_Orlando, USA_NewYork, and Taiwan_Kaohsiung.
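The prompt-construction step can be sketched as follows. Note that `build_prompt`, its argument names, and the template string are illustrative assumptions, not the repository's actual implementation; in practice the address would come from reverse geocoding the coordinates (e.g. via the googlemaps client used by single_inference.py):

```python
def build_prompt(address: str, captions: list[str]) -> str:
    """Compose a question-generation prompt from a reverse-geocoded
    address and the captions of the four side street-view images.
    The template below is a hypothetical example, not the exact one
    used by FDT5_Model."""
    caption_text = " ".join(
        f"View {i + 1}: {c}" for i, c in enumerate(captions)
    )
    return (
        f"Location: {address}. {caption_text} "
        "Generate an engaging question about this place."
    )

prompt = build_prompt(
    "No. 1, Zhongshan Rd, Kaohsiung, Taiwan",
    ["a busy intersection", "a row of shops",
     "a parked scooter", "a tall building"],
)
```

The model then conditions on a prompt of this shape to produce a question about the location.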


πŸ“‚ Repository Structure

└── FDT5_Model/
    β”œβ”€β”€ StreetviewFilter/
    β”‚   └── checkpoint-27100/
    β”‚       └── config.json
    β”œβ”€β”€ checkpoints/
    β”‚   └── DistilledStreetviewFilter_T5LargeTeacher/
    β”‚       └── checkpoint-54000/
    β”œβ”€β”€ count_vocab.py
    β”œβ”€β”€ dataset.py
    β”œβ”€β”€ model.py
    β”œβ”€β”€ requirements.txt
    β”œβ”€β”€ results/
    β”‚   β”œβ”€β”€ Taiwan_Kaohsiung.json
    β”‚   β”œβ”€β”€ USA_NewYork.json
    β”‚   β”œβ”€β”€ USA_Orlando.json
    β”‚   └── USA_Pittsburgh.json
    β”œβ”€β”€ single_inference.py
    β”œβ”€β”€ streetview_images/
    β”‚   β”œβ”€β”€ Taiwan_Kaohsiung/
    β”‚   β”œβ”€β”€ USA_NewYork/
    β”‚   β”œβ”€β”€ USA_Orlando/
    β”‚   └── USA_Pittsburgh/
    β”œβ”€β”€ utils_classifier.py
    └── ζŒ‡δ»€.txt

🧩 Modules

.
| File | Summary |
| --- | --- |
| requirements.txt | Lists the Python dependencies of FDT5_Model, including transformers and torch, used by count_vocab.py, dataset.py, model.py, and single_inference.py. |
| count_vocab.py | Builds a vocabulary from a given text file: counts word frequencies, discards words below a specified threshold, and stores the resulting word-to-index and index-to-word mappings. The file path and threshold are taken as command-line arguments. |
| ζŒ‡δ»€.txt | Instructions ("ζŒ‡δ»€" means "instructions") for running single inference on street-view images with single_inference.py, given coordinates and a location. |
| model.py | Implements the T5-based model operations, including self-attention, cross-attention, and feed-forward layers, using torch and transformers. |
| dataset.py | Initializes the Dataset object, sets its parameters, and preprocesses the data when the CLIP library is used. |
| utils_classifier.py | Defines the EngagingDataset class, which creates and manages the dataset for training the classifier; it provides methods for preprocessing and organizing the data, batching, and retrieving predictions, built on torch, pandas, and DataLoader. |
| single_inference.py | Performs single-inference tasks on street-view images, using transformers, torch, pandas, and googlemaps to process data and generate results. |
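The thresholded vocabulary build that count_vocab.py performs can be approximated with this minimal sketch (function and variable names are assumptions; the actual script reads the text file path and threshold from command-line arguments):

```python
from collections import Counter

def build_vocab(text: str, threshold: int) -> tuple[dict, dict]:
    """Count word frequencies, keep only words whose count meets
    the threshold, and return word-to-index and index-to-word
    mappings (a sketch of count_vocab.py's logic)."""
    counts = Counter(text.split())
    kept = [w for w, c in counts.items() if c >= threshold]
    word2idx = {w: i for i, w in enumerate(kept)}
    idx2word = {i: w for w, i in word2idx.items()}
    return word2idx, idx2word

w2i, i2w = build_vocab("the car the road the car bridge", threshold=2)
# "the" appears 3 times and "car" twice, so both survive a threshold
# of 2, while "road" and "bridge" are discarded
```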
StreetviewFilter.checkpoint-27100
| File | Summary |
| --- | --- |
| config.json | Defines the architecture and parameters of a pre-trained T5 model for conditional generation, used to generate questions from the given input. |
checkpoints.DistilledStreetviewFilter_T5LargeTeacher.checkpoint-54000
| File | Summary |
| --- | --- |
| config.json | Configuration for the distilled T5-based conditional-generation checkpoint, loaded through the Transformers library and used together with model.py, dataset.py, and single_inference.py for dataset processing, training, and inference. |
results
| File | Summary |
| --- | --- |
| USA_Pittsburgh.json | Generated results for the Pittsburgh street-view samples. |
| USA_Orlando.json | Generated questions prompting users to discuss aspects of the Orlando street views such as architectural features, landmarks, local businesses, and traffic flow. |
| Taiwan_Kaohsiung.json | Generated questions covering aspects of the Kaohsiung street views such as cleanliness, architecture, landmarks, and businesses in the area. |
| USA_NewYork.json | Questions generated by the pre-trained model about the New York street views, covering architecture, businesses, landmarks, traffic, and other street characteristics. |

πŸš€ Getting Started

βš™οΈ Installation

  1. Clone the FDT5_Model repository:
git clone https://github.com/kennysuper007/FDT5_Model
  2. Change to the project directory:
cd FDT5_Model
  3. Install the dependencies:
pip install -r requirements.txt

πŸ€– Running FDT5_Model

Use one of the following commands to run FDT5_Model on a sample location:

python3 single_inference.py --coordinate 22.6408282,120.3222442 --location Taiwan_Kaohsiung
python3 single_inference.py --coordinate 40.440309,-80.0 --location USA_Pittsburgh
python3 single_inference.py --coordinate 28.541323,-81.380703 --location USA_Orlando
python3 single_inference.py --coordinate 40.73055,-74.001715 --location USA_NewYork
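A minimal sketch of how single_inference.py's flags might be parsed, assuming argparse (the actual script's argument handling may differ):

```python
import argparse

def parse_args(argv):
    """Parse the --coordinate and --location flags shown in the
    commands above (a sketch; flag semantics assumed)."""
    parser = argparse.ArgumentParser(description="FDT5 single inference")
    parser.add_argument("--coordinate", required=True,
                        help="latitude,longitude of the vehicle")
    parser.add_argument("--location", required=True,
                        help="dataset location name, e.g. Taiwan_Kaohsiung")
    args = parser.parse_args(argv)
    lat, lng = (float(x) for x in args.coordinate.split(","))
    return lat, lng, args.location

lat, lng, location = parse_args(
    ["--coordinate", "22.6408282,120.3222442",
     "--location", "Taiwan_Kaohsiung"]
)
```

The location name selects the matching subdirectory under streetview_images/ and determines which results file is produced.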

🀝 Contributing

Contributions are welcome! Here are several ways you can contribute:

Contributing Guidelines
  1. Fork the Repository: Start by forking the project repository to your GitHub account.
  2. Clone Locally: Clone the forked repository to your local machine using a Git client.
    git clone <your-forked-repo-url>
  3. Create a New Branch: Always work on a new branch, giving it a descriptive name.
    git checkout -b new-feature-x
  4. Make Your Changes: Develop and test your changes locally.
  5. Commit Your Changes: Commit with a clear and concise message describing your updates.
    git commit -m 'Implemented new feature x.'
  6. Push to GitHub: Push the changes to your forked repository.
    git push origin new-feature-x
  7. Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.

Once your PR is reviewed and approved, it will be merged into the main branch.


πŸ“„ License

This project is licensed under the MIT License. For more details, refer to the LICENSE file.


πŸ‘ Acknowledgments

  • I modified the inference commands to demonstrate the capabilities of the FDT5 model trained by Nicholas.

About

A tiny model to generate engaging questions from vehicle surround views
