Reports, White papers and eBooks

The new report on GPT-4, a large-scale, multimodal model that supports image and text inputs and outputs (Techatty)

Open-AI - We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformerbased model pre-trained to predict the next token in a document.

Techatty

Jan 6, 2024 - 16:00

Jan 13, 2024 - 12:25

The new report on GPT-4, a large-scale, multimodal model that supports image and text inputs and outputs (Techatty)

GPT-4 report in PDF format for free download

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformerbased model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide
range of scales. This allowed us to accurately predict some aspects of GPT-4’s performance based on models trained with no more than 1/1,000th the compute of GPT-4.

1 Introduction
This technical report presents GPT-4, a large multimodal model capable of processing image and text inputs and producing text outputs. Such models are an important area of study as they have the
potential to be used in a wide range of applications, such as dialogue systems, text summarization,
and machine translation. As such, they have been the subject of substantial interest and progress in
recent years [1–34].

One of the main goals of developing such models is to improve their ability to understand and generate
natural language text, particularly in more complex and nuanced scenarios. To test its capabilities
in such scenarios, GPT-4 was evaluated on a variety of exams originally designed for humans. In
these evaluations it performs quite well and often outscores the vast majority of human test takers.
For example, on a simulated bar exam, GPT-4 achieves a score that falls in the top 10% of test takers.
This contrasts with GPT-3.5, which scores in the bottom 10%.

On a suite of traditional NLP benchmarks, GPT-4 outperforms both previous large language models
and most state-of-the-art systems (which often have benchmark-specific training or hand-engineering).
On the MMLU benchmark [35, 36], an English-language suite of multiple-choice questions covering
57 subjects, GPT-4 not only outperforms existing models by a considerable margin in English, but
also demonstrates strong performance in other languages. On translated variants of MMLU, GPT-4
surpasses the English-language state-of-the-art in 24 of 26 languages considered. We discuss these
model capability results, as well as model safety improvements and results, in more detail in later
sections.

This report also discusses a key challenge of the project, developing deep learning infrastructure and
optimization methods that behave predictably across a wide range of scales. This allowed us to make
predictions about the expected performance of GPT-4 (based on small runs trained in similar ways)
that were tested against the final run to increase confidence in our training.
Despite its capabilities, GPT-4 has similar limitations to earlier GPT models [1, 37, 38]: it is not fully
reliable (e.g. can suffer from “hallucinations”), has a limited context window, and does not learn
∗
Please cite this work as “OpenAI (2023)". Full authorship contribution statements appear at the end of the
document. Correspondence regarding this technical report can be sent to gpt4-report@openai.com
arXiv:2303.08774v3 [cs.CL] 27 Mar 2023 from experience. Care should be taken when using the outputs of GPT-4, particularly in contexts where reliability is important.

GPT-4’s capabilities and limitations create significant and novel safety challenges, and we believe
careful study of these challenges is an important area of research given the potential societal impact.
This report includes an extensive system card (after the Appendix) describing some of the risks we
foresee around bias, disinformation, over-reliance, privacy, cybersecurity, proliferation, and more.
It also describes interventions we made to mitigate potential harms from the deployment of GPT-4,
including adversarial testing with domain experts, and a model-assisted safety pipeline.

2 Scope and Limitations of this Technical Report
This report focuses on the capabilities, limitations, and safety properties of GPT-4. GPT-4 is a
Transformer-style model [39] pre-trained to predict the next token in a document, using both publicly
available data (such as internet data) and data licensed from third-party providers. The model was
then fine-tuned using Reinforcement Learning from Human Feedback (RLHF) [40]. Given both
the competitive landscape and the safety implications of large-scale models like GPT-4, this report
contains no further details about the architecture (including model size), hardware, training compute,
dataset construction, training method, or similar.
We are committed to independent auditing of our technologies, and shared some initial steps and
ideas in this area in the system card accompanying this release.2 We plan to make further technical
details available to additional third parties who can advise us on how to weigh the competitive and
safety considerations above against the scientific value of further transparency.

3 Predictable Scaling
A large focus of the GPT-4 project was building a deep learning stack that scales predictably. The
primary reason is that for very large training runs like GPT-4, it is not feasible to do extensive
model-specific tuning. To address this, we developed infrastructure and optimization methods that
have very predictable behavior across multiple scales. These improvements allowed us to reliably
predict some aspects of the performance of GPT-4 from smaller models trained using 1, 000× –
10, 000× less compute.

3.1 Loss Prediction
The final loss of properly-trained large language models is thought to be well approximated by power
laws in the amount of compute used to train the model [41, 42, 2, 14, 15].
To verify the scalability of our optimization infrastructure, we predicted GPT-4’s final loss on our
internal codebase (not part of the training set) by fitting a scaling law with an irreducible loss term
(as in Henighan et al. [15]): L(C) = aCb + c, from models trained using the same methodology
but using at most 10,000x less compute than GPT-4. This prediction was made shortly after the run
started, without use of any partial results. The fitted scaling law predicted GPT-4’s final loss with
high accuracy (Figure 1).

3.2 Scaling of Capabilities on HumanEval
Having a sense of the capabilities of a model before training can improve decisions around alignment,
safety, and deployment. In addition to predicting final loss, we developed methodology to predict
more interpretable metrics of capability. One such metric is pass rate on the HumanEval dataset [43],
which measures the ability to synthesize Python functions of varying complexity. We successfully
predicted the pass rate on a subset of the HumanEval dataset by extrapolating from models trained
with at most 1, 000× less compute (Figure 2).

For an individual problem in HumanEval, performance may occasionally worsen with scale. Despite
these challenges, we find an approximate power law relationship −EP [log(pass_rate(C))] = α∗C−k

2
In addition to the accompanying system card, OpenAI will soon publish additional thoughts on the social
and economic implications of AI systems, including the need for effective regulation.

Download the full GPT-4 Report in PDF format below. Powred by Open-AI.

Download files

Tags:

Redwire Announces Live Event with Main Engine Cut Off Podcast, Previews Full Sch...

Techatty Connecting the world of tech differently! Read. Write. Learn. Thrive. Make an informed decision without distractions. We are building tech media and publication networks to connect YOU and everyone to reliable information, opportunities, and resources to achieve greater success.

Sponsor to Give Hope, Transform, and Uplift Lives.

	Need help implementing innovative technology, with tech support or management? You can count on us.
	24-7 Press Release - Let's distribute your Press Releases to traditional and digital media outlets. Get started!
	Reliable Website Security Solutions, built for small businesses, web professionals, and enterprise organizations.
	Paternity Lab - bringing DNA Paternity Testing closer to people. We offer accurate, affordable, and easy DNA Paternity Testing. Also at home.
	Rexing USA - exclusive cameras, car gadgets, and EV accessories with unique designs, innovative technology, and in affordable price ranges.

The Rising Wave of Blockchain Technology Adop...

HackaTRON Season 7 Launches With Google Cloud...

Skybridge Founder: Kamala Harris Open-Minded ...

Auradine Ships 3nm Teraflux Bitcoin Mining Pl...

Wazirx Details Plan to Resume Withdrawals and...

Agentic AI Leaders to Showcase Latest Advance...

NVIDIA Releases NIM Microservices to Safeguar...

How AI Is Enhancing Surgical Safety and Educa...

NVIDIA and IQVIA Build Domain-Expert Agentic ...

AI Gets Real for Retailers: 9 Out of 10 Retai...

Alleged Co-Founder of Garantex Arrested in India

Feds Link $150M Cyberheist to 2022 LastPass H...

Who is the DOGE and X Technician Branden Spikes?

Notorious Malware, Spam Host “Prospero” Moves...

U.S. Soldier Charged in AT&T Hack Searched “C...

The new report on GPT-4, a large-scale, multimodal model that supports image and text inputs and outputs (Techatty)

Download the full GPT-4 Report in PDF format below. Powred by Open-AI.

Download files

Tags:

Redwire Announces Live Event with Main Engine Cut Off Podcast, Previews Full Sch...

April Showers Bring 23 New GeForce NOW Games, Including ‘Have a Nice Death’

What is Haptics technology?

The must-read cybersecurity report of 2023

History, origins, and the future of the internet revolu...

Change language

SPONSORED

Recommended for you

Great Opportunity You Can't Reject! (No, Seriously...

Pause and let's talk about responsible spending an...

Experts Estimate £20 Million+ Loss from Heathrow A...

Welcome to ProtoPie

Ready to turn your innovative tech business dream ...

Gold Could Surge to $40,000 per Ounce, Strategist ...

Web & Cloud - Engineering Tech for a Better Tomorrow!

Introducing: Techatty Aerospace

The new report on GPT-4, a large-scale, multimodal model that supports image and text inputs and outputs (Techatty)

Download the full GPT-4 Report in PDF format below. Powred by Open-AI.

Download files

Tags:

Related Posts

Change language

SPONSORED

Recommended for you