I'm excited to announce that I've uploaded AutoMarineSI to GitHub. AutoMarineSI is a Python script that utilizes OpenAI's GPT3/4 to infer possible future accidents from inputted near-miss reports. In this blog post, I'll provide a brief overview of its structure and development background.
How Does It Work?
The concept diagram for AutoMarineSI is shown below. (1) It extracts past similar accident cases from the content of the near-miss report entered by the user and (2) infers possible accidents based on these cases. It then outputs the results along with safety measures.
(1) Procedure for Extracting Similar Past Accidents
Preparation:
Using AddEmbedding.py, the causes of accidents recorded in past accident data are converted into embedding vector data. The model in use is OpenAI's "text-embedding-ada-002" by default. The vector data, along with other items from the accident data such as time and place of occurrence and report URL, need to be registered as metadata in the Pinecone database. ToPinecone.py is used for this registration.
Data Extraction:
AutoMarineSI.py handles the processing of the entered near-miss reports. First, the report is converted into embedding vector data. Then, it is compared with the previously registered accident data in Pinecone, and similar cases are extracted using cosine similarity. AutoMarineSI.py extracts a predefined number of similar cases. The number of extractions may need to be adjusted depending on the amount of past accident data (although this has not been validated).
(2) Inferring Future Accident Cases and Safety Measures
AutoMarinSI.py provides GPT with the prompt, the inputted near-miss report, and the metadata of the extracted similar accident data. The value returned by GPT is then displayed. "Foreseeable accident summary:" represents the accidents inferred by GPT, and "Safety measures:" signifies the safety precautions. Subsequently, it can display the top five similar accident cases, which can be used to verify if GPT has associated unrelated accidents.
Why Use Past Accident Data?
In my area of expertise, vessel operations, it's commonly understood that accidents are not caused by a single factor, but by a convergence of multiple factors. Conversely, if even one factor from a near-miss aligns with a factor from past accidents, it signifies a potential for the same type of accident to occur.
Accident reports often contain the causes of the accidents in the form of summaries or timelines. If we can search for similarities, we might be able to predict accidents that could potentially arise from near-misses. This was the inspiration behind creating AutoMarineSI.
What to Expect Going Forward
While we have yet to verify whether AutoMarineSI can truly foresee accidents, it's necessary for companies like MarineSI, who receive close call reports from ships, to compile this information and provide feedback. This process closely mirrors what AutoMarineSI does, leading us to believe that at the very least, it can serve as an assistant for administrative tasks.
AutoMarineSI is still a prototype, and there is room for verification and consideration in its prompts and loading procedures. For instance, after extracting similar accident cases, if those turn out to be serious incidents, it could be possible to access the report's URL for further detailed inference.
In addition, there are limits to what can be achieved with my ideas alone. I hope to further develop it with the assistance of software engineers and domain experts.
Acknowledgements
The inception of AutoMarineSI was inspired by Satoshi Nakajima's email newsletter, "Weekly Life is Beautiful". This provided me with the realization of the potential application of Embeddings + ChatGPT, and he was kind enough to answer my subsequent questions in his newsletter. This encounter with the combination of Embeddings + ChatGPT undeniably marked a transformative moment for me. I extend my sincere gratitude to Mr. Nakajima.
In creating the script, I referred to the OpenAI cookbook and also drew upon the capabilities of ChatGPT. My deepest thanks go out to the developers and everyone involved in these technological advancements.