Neoskeptics: Developing reliable specialized large language models (LLMs) for the multi-billion dollar online/hybrid education market

Last update: Friday 5/12/23

Why did OpenAI release ChatGPT with the GPT-3.5 model, then with GPT-4, even though both models were prone to factual errors, bias, and "hallucinations"? OpenAI claimed that it did so, even at the risk of being banned in some countries (e.g., Italy temporarily), in order to learn from widespread user experience with the flawed models.

A. Enhanced success of OpenAI's stock offering

Whatever OpenAI's real motivations, a limited release would have attracted only a tiny fraction of the publicity generated by the widespread initial distribution, publicity that enabled OpenAI to quickly complete a stock offering that valued the company at $27 billion to $29 billion:

"OpenAI closes $300M share sale at $27B-29B valuation", Jagmeet Singh, Ingrid Lunden, TechCrunch, 4/28/23

B. Ask me anything

The fact that ChatGPT acquired over 100 million users within a few months fueled the success of OpenAI's stock offering. Its users were dazzled by the apparent scope and depth of ChatGPT's knowledge. It seemed to know everything about everything. Yes, it sometimes made statements that were wrong, but so did most people. And yes, it sometimes asserted value judgements with the same certainty as its statements of fact, but so did most people. But how many people could summarize the Bill of Rights -- the first ten amendments to the U.S. Constitution -- in the form of a Shakespearean sonnet within five seconds? Not many, if any, so ChatGPT's millions of users were dazzled. (Note: Here's a link to a ChatGPT summary)

C. Second thoughts

OpenAI explained that its GPT-4 model yielded better results than GPT-3 because "four" was based on far more data than "three". From this explanation, many users inferred that as the data files underlying subsequent models grew larger and larger, the remaining flaws in GPT-4 would be reduced, perhaps eliminated ... but would they? Sam Altman, OpenAI's CEO, recently rejected this inference:

"OpenAI’s CEO Says the Age of Giant AI Models Is Already Over" ... “We'll make them better in other ways", Will Knight, Wired, 4/19/23

Unfortunately, Altman did not suggest alternative strategies. The rest of this note will discuss a collection of lucrative use cases that would be well served by smaller, more specialized models. The editor of this blog is not an AI expert, so his options will be sketched at a high, 10,000-foot conceptual level.

D. Focus on limited, legally accessible information

Most of the use cases for GPT-4 reported in the media have involved chatbot responses to questions about school-related or work-related issues.

In a non-work, non-school context, we might be comfortable with chatbots that seemed to know everything about everything, but sometimes uttered false statements, just like our real friends and associates, e.g., providing us with recipes that tasted horrible, suggesting physical workout schedules that weren't sustainable, or providing advice about relationships that was ineffective.
By contrast, our responsibilities as students or professionals in whatever field we studied or were employed would not require that our chatbot assistants know everything about everything; they would only have to know everything about the subjects related to our school assignments or work tasks.
Most users would prefer chatbots that based their assertions on data to which their users had legitimate access. Information that violated someone else's personal rights or copyrights/patents might be useful in the short term, but would entail longer term risks of costly penalties for users who found themselves entangled in lawsuits or facing criminal charges.

In the next section, the editor will propose an alternative cluster of options based on LLMs that would be much smaller than GPT-4, but would produce more reliable results based on information acquired through voluntary, binding agreements with the information owners. These options were selected because of their similarities to the most dazzling use cases so far, those in software engineering, which have not been plagued by unreliable results, biases, or hallucinations.

ChatGPT can write reliable original code and debug flawed code submitted for its review. Why? Because its training data contained only a tiny percentage of errors and was presented in a context that embodied minimal gender, racial, national, or other kinds of bias.

Note: The editor is not suggesting that LLMs that respond to questions unrelated to jobs or academic studies will not be developed. However, the biggest drivers of LLM development are hugely profitable, ginormous corporations -- Google, Microsoft, Amazon, Baidu, Alibaba -- all of which want to use LLMs to earn the biggest profits from the lowest investment costs while being subjected to the least regulation and incurring minimal losses in lawsuits. In other words, the biggest developers will want to grab the most profitable low-hanging fruit first.

E. Reliable specialized LLMs

This discussion proposes that LLMs focus on STEM+M+D, law, and medicine, i.e., on Science, Technology, Engineering, and Mathematics plus the algorithmic components of Management science (e.g., supply chain management, risk assessment, project management), Data science/analysis, law, and medicine.

Publications from reputable publishers in STEM+M+D, law, and medicine share important qualities with articles and books about the software engineering component of technology, the "T" in STEM+M+D. Therefore LLMs trained on their publications should also be highly reliable.

High quality publications are accessible via the Internet from reputable publishers for free or via paid subscriptions. Redistribution fees can usually be negotiated with the publisher/owners, so copyrights need not be infringed.
Individuals whose behavior is discussed in these publications are usually not identified, so privacy rights are not violated.
The high quality of the information in textbooks and journal articles from reputable publishers in STEM+M+D, law, and medicine has been attained through rigorous peer review, editorial guidance, and extensive fact checking.

F. Low-hanging fruit = The multi-billion dollar online/hybrid market for academic degrees and professional certificates in STEM+M+D, law, and medicine

The most lucrative sectors of the online/hybrid education market offer degree and certificate programs for undergraduate students, graduate students, and practicing professionals. The highest tuition and fees are charged for programs in STEM+M+D, law, and medicine; and the lion's share of online academic degree courses and online professional certificate courses are offered in these fields.

Online courses are fully online, i.e., they have no classroom components, whereas hybrid courses have substantial online and classroom components.
It is widely conceded that the best face-to-face courses in the world are still better than the best online courses, and that the best hybrid courses are best of all.

But the best face-to-face courses and the best hybrid courses are only accessible to students enrolled on the elite campuses where these courses are taught; by contrast, the best online courses are accessible to students anywhere in the world. Unfortunately, online courses require more self-discipline, better time management skills, and more effective study habits than face-to-face or hybrid courses.

However, the addition of reliable, specialized LLMs that provide customized one-on-one tutoring will help students in hybrid courses cover more material in between classroom sessions. Indeed, the deployment of reliable, specialized LLMs will facilitate the conversion of most traditional courses into hybrid formats. Furthermore, these instantly accessible LLM tutors will also make it easier for more students to become proficient learners in solitary online environments.

In summary, we should anticipate that the deployment of reliable, specialized LLMs will greatly expand the market for online and hybrid courses in STEM+M+D, law, and medicine that are components of degree and certificate programs for undergraduate students, graduate students, and practicing professionals.

Question -- How large will the online/hybrid education market be for academic degrees and professional certificates in STEM+M+D, law, and medicine within the next few years?

Answer #1 (low ball estimate) -- At least $267 billion by 2027

According to a report posted by Statista, the world-wide market for online courses should reach $167 billion in 2023 and rise by $72 billion, i.e., by 43%, to $239 billion by 2027.

These forecasts were calculated in October 2022, before the initial public distribution of GPT-3.5, followed by the far more impressive GPT-4. Accordingly, they do not embody the potential boost that more reliable specialized LLMs might bestow. If we assumed a modest boost from better LLMs of 17 percentage points on top of the 43% baseline growth, the online market would rise by 60% from its 2023 level, i.e., by about $100 billion, to about $267 billion in 2027.
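
The arithmetic behind Answer #1 can be sketched in a few lines of Python. This is only an illustration: the variable and function names are invented here, and the 17 percent boost is read as 17 percentage points of extra growth measured against the 2023 base, because that is the reading that reproduces the 60% and $267 billion figures in the text.

```python
# Figures from the text, in billions of US dollars. Names are illustrative.
ONLINE_2023 = 167.0       # Statista estimate for 2023
BASELINE_2027 = 239.0     # Statista forecast for 2027
LLM_BOOST_POINTS = 0.17   # assumed extra growth, as a share of the 2023 base


def growth_rate(start: float, end: float) -> float:
    """Return total growth from start to end as a fraction of start."""
    return (end - start) / start


def boosted_forecast(start: float, baseline_end: float, extra_points: float) -> float:
    """Add extra growth (measured against start) to the baseline forecast."""
    return start * (1.0 + growth_rate(start, baseline_end) + extra_points)


baseline_growth = growth_rate(ONLINE_2023, BASELINE_2027)
online_2027 = boosted_forecast(ONLINE_2023, BASELINE_2027, LLM_BOOST_POINTS)

print(f"Baseline growth 2023-2027: {baseline_growth:.0%}")
print(f"Boosted online market in 2027: ${online_2027:.0f} billion")
```

Changing LLM_BOOST_POINTS shows how sensitive the estimate is to the assumed boost, which is the one number in this calculation that does not come from the Statista report.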

Answer #2 (a higher low ball estimate) -- At least $534 billion by 2027

The previous estimates only covered online courses; they did not cover hybrid courses. At present, online and hybrid courses occupy small shares of the total market for undergraduate degree, graduate degree, and professional certificate courses; the vast majority of such courses are still face-to-face. In principle, every face-to-face course could be upgraded to hybrid status by adding access to LLMs. But there won't be much movement in that direction with error-prone, biased, hallucinating LLMs like GPT-4.

However, the development of reliable, specialized LLMs is likely to trigger tsunamis of upgrades in STEM+M+D, law, and medicine courses. Indeed, students in these fields will demand access to the new LLMs. Colleges, universities, and other educational entities that don't provide such access will be penalized by substantial reductions in their share of their sectors of the education market.

Nevertheless, out of an abundance of caution, we propose another modest, low ball estimate. We suggest that the number of students enrolled in hybrid courses that provide access to reliable specialized LLMs in 2027 will be at least as large as the number enrolled in online courses. If those students generate comparable revenue per student, that implies a $267 billion market for hybrid courses. Therefore the combined hybrid/online market will be at least $267 + $267 = $534 billion.
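
The combined estimate rests on a single assumption: that the hybrid market is at least as large as the online market. A minimal sketch makes that assumption an explicit parameter. The function name and the ratio parameter are hypothetical, introduced only for illustration; a ratio of 1.0 corresponds to the estimate in the text.

```python
def combined_market(online_billion: float, hybrid_to_online_ratio: float = 1.0) -> float:
    """Return online plus hybrid market size, in billions of US dollars.

    hybrid_to_online_ratio is the assumed size of the hybrid market
    relative to the online market (1.0 means equal size).
    """
    return online_billion * (1.0 + hybrid_to_online_ratio)


ONLINE_2027 = 267.0  # boosted online estimate from Answer #1, in billions

# The ratios other than 1.0 are illustrative what-if values, not forecasts.
for ratio in (0.5, 1.0, 1.5):
    total = combined_market(ONLINE_2027, ratio)
    print(f"hybrid/online ratio {ratio:.1f}: ${total:.1f} billion combined")
```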

G. "Will I hallucinate?"

One of the most troublesome of GPT-4's deficiencies has been its occasional emission of "hallucinations", i.e., wildly invalid statements that seem to come from out of nowhere. There is reason to believe that ordinary errors and bias reflect errors and bias in the data on which GPT-4 was trained. But where do hallucinations come from?

Readers who have seen "2010", the sequel to Stanley Kubrick's classic film "2001: A Space Odyssey", will remember the touching scene wherein SAL -- the twin of HAL, the AI system that supported all of the astronauts' activities in "2001" -- learns that it will be turned off. SAL asks anxiously, "Will I dream?" Its cyber genius developer calmly responds that it will dream because all intelligent entities dream. The corresponding question for a reliable, specialized LLM would be "Will I hallucinate?" Only time will tell, time and extensive testing before its responsible developers release it to the public.

____________________________________

Links to related notes on this blog:

"Useful AI News", Updated every day

Wednesday, May 17, 2023
