Overview
GPT-4 can solve difficult problems with greater accuracy, thanks to its broader general knowledge and problem-solving abilities.

Creativity
GPT-4 is more creative and collaborative than ever before. It can generate, edit, and iterate with users on creative and technical writing tasks, such as composing songs, writing screenplays, or learning a user's writing style.
Input
Explain the plot of Cinderella in a sentence where each word has to begin with the next letter in the alphabet from A to Z, without repeating any letters.

Output
A beautiful Cinderella, dwelling eagerly, finally gains happiness; inspiring jealous kin, love magically nurtures opulent prince; quietly rescues, slipper triumphs, uniting very wondrously, xenial youth zealously.

GPT-4 surpasses GPT-3.5 in its advanced reasoning capabilities.

GPT-3.5
Input
Andrew is free from 11 am to 3 pm, Joanne is free from noon to 2 pm and then 3:30 pm to 5 pm. Hannah is available at noon for half an hour, and then 4 pm to 6 pm. What are some options for start times for a 30 minute meeting for Andrew, Hannah, and Joanne?

Output
Here is a list of their availability:
- Andrew: 11 am to 3 pm
- Joanne: noon to 2 pm, and 3:30 pm to 5 pm
- Hannah: noon to 12:30 pm, and 4 pm to 6 pm
Based on their availability, there is a 30-minute window where all three of them are available, which is from 4 pm to 4:30 pm. So, the meeting can be scheduled at 4 pm.

GPT-4
Input
Andrew is free from 11 am to 3 pm, Joanne is free from noon to 2 pm and then 3:30 pm to 5 pm. Hannah is available at noon for half an hour, and then 4 pm to 6 pm. What are some options for start times for a 30 minute meeting for Andrew, Hannah, and Joanne?

Output
Andrew: 11 am - 3 pm
Joanne: 12 pm - 2 pm, 3:30 pm - 5 pm
Hannah: 12 pm - 12:30 pm, 4 pm - 6 pm
Common availability for a 30-minute meeting: 12 pm - 12:30 pm

GPT-4 outperforms GPT-3.5 by scoring in higher approximate percentiles among test-takers.

- Uniform Bar Exam: 10th percentile (GPT-3.5), 90th percentile (GPT-4)
- Biology Olympiad: 31st percentile (GPT-3.5), 99th percentile (GPT-4, with vision)
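
The meeting-scheduling puzzle in the reasoning comparison above can be checked mechanically by intersecting the three availability lists. The following is a minimal sketch, not anything taken from the original page: the helper names (`hm`, `intersect`, `common_windows`, `start_times`, `fmt`) are hypothetical, and times are assumed to be minutes after midnight on a single day.

```python
# Illustrative sketch: helper names and the minutes-after-midnight encoding
# are assumptions made for this example, not part of the original page.

def hm(hour, minute=0):
    """Convert a 24-hour clock time to minutes after midnight."""
    return hour * 60 + minute

# Availability windows from the prompt, as half-open (start, end) intervals.
availability = {
    "Andrew": [(hm(11), hm(15))],
    "Joanne": [(hm(12), hm(14)), (hm(15, 30), hm(17))],
    "Hannah": [(hm(12), hm(12, 30)), (hm(16), hm(18))],
}

def intersect(a, b):
    """Return the overlapping parts of two interval lists."""
    out = []
    for s1, e1 in a:
        for s2, e2 in b:
            start, end = max(s1, s2), min(e1, e2)
            if start < end:  # keep only non-empty overlaps
                out.append((start, end))
    return sorted(out)

def common_windows(schedules):
    """Intersect every person's intervals to find when all are free."""
    windows = None
    for intervals in schedules.values():
        windows = intervals if windows is None else intersect(windows, intervals)
    return windows or []

def start_times(windows, duration):
    """For each window long enough, return (earliest, latest) start time."""
    return [(s, e - duration) for s, e in windows if e - s >= duration]

def fmt(minutes):
    """Format minutes after midnight as a 12-hour clock string."""
    h, m = divmod(minutes, 60)
    suffix = "am" if h < 12 else "pm"
    return f"{h % 12 or 12}:{m:02d} {suffix}"

if __name__ == "__main__":
    for earliest, latest in start_times(common_windows(availability), 30):
        print(f"start between {fmt(earliest)} and {fmt(latest)}")
```

Under these assumptions the only shared window is noon to 12:30 pm, so a 30-minute meeting can start only at noon. The 4 pm slot proposed in the GPT-3.5 output falls outside Andrew's availability, which ends at 3 pm.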
Following the research path from GPT, GPT-2, and GPT-3, our deep learning approach leverages more data and more computation to create increasingly sophisticated and capable language models.

We spent 6 months making GPT-4 safer and more aligned. GPT-4 is 82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5 on our internal evaluations.

Safety & alignment

Training with human feedback
We incorporated more human feedback, including feedback submitted by ChatGPT users, to improve GPT-4's behavior. We also worked with over 50 experts for early feedback in domains including AI safety and security.
Continuous improvement from real-world use
We've applied lessons from real-world use of our previous models to GPT-4's safety research and monitoring system. Like ChatGPT, we'll be updating and improving GPT-4 at a regular cadence as more people use it.
GPT-4-assisted safety research
GPT-4's advanced reasoning and instruction-following capabilities expedited our safety work. We used GPT-4 to help create training data for model fine-tuning and iterate on classifiers across training, evaluations, and monitoring.