The good news ... for experts in a wide range of other fields ... is that new employment opportunities are just over the horizon for computer-savvy personnel who learn how to use generative AI to solve complex problems in their fields. So how can they learn these new skills?
For slower-moving technologies, courses based on textbooks have traditionally been the most effective way to learn. But genAI is developing faster than publishers can update their textbooks, so the best way to keep up with such rapid innovation is to keep abreast of the latest use cases.
- The good news is that the tech companies that create innovations usually include use cases for their innovations in their presentations at conferences and other assemblies.
- The bad news is that there is no one place that anyone can go to find these use cases after the conferences and other assemblies are over. So it is difficult for a prospective user to compare similar innovations, e.g., to determine which AI tool is easier to use and/or provides more accurate summaries of PDF files.
- Our nation needs at least one reputable organization to provide a badly needed public service for the next two or three years by acting as a curator of use cases: a librarian who maintains up-to-date online catalogs of cases, with copies of the data embodied in those cases in “standard” formats, and/or allows visitors to the catalog to run the use cases on their own data. (A sketch of what one such catalog record might look like appears after this list.)
- The curator/librarian should not rate the performance of an app in a use case. However, visitors to the catalog should be encouraged to provide their own assessments, perhaps using the same kind of comments plus four-, three-, two-, and one-star ratings that purchasers provide on Amazon’s online shopping sites.
- Visitors to the catalog would be required to have appropriate subscriptions to run the cases. For example, if a visitor wanted to run an OpenAI use case that demonstrated a feature that required a ChatGPT Plus subscription, the visitor would need a Plus subscription to run the case.
- In order to close the gap between the new opportunities that are about to appear and the limited current supply of qualified applicants as quickly as possible, the curator/librarian should invest substantial efforts to solicit use cases from pioneers in academia and non-academic training institutions who have already developed their own use cases. These early cases will enable other instructors to learn the new skills themselves and then teach their students the new skills using the same use cases.
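To make the proposed catalog concrete, here is a minimal sketch, in Python, of what one use-case record might look like. No such curator exists yet, so every field name here is an assumption chosen for illustration, not an established schema.

```python
from dataclasses import dataclass, field

@dataclass
class UseCaseRecord:
    """One entry in a hypothetical use-case catalog (all fields illustrative)."""
    title: str                   # e.g., "Summarize a 50-page PDF"
    vendor: str                  # company that demonstrated the use case
    model: str                   # model/version the demo ran on
    required_subscription: str   # tier a visitor needs to reproduce the case
    sample_data_url: str         # copy of the demo data in a "standard" format
    source_event: str            # conference or assembly where it was shown
    visitor_ratings: list[int] = field(default_factory=list)  # 1-4 stars

    def average_rating(self) -> float | None:
        """Average of visitor-supplied star ratings, if any exist."""
        if not self.visitor_ratings:
            return None
        return sum(self.visitor_ratings) / len(self.visitor_ratings)

# Example entry; every value below is a placeholder, not a real demo.
record = UseCaseRecord(
    title="Summarize a 50-page PDF",
    vendor="OpenAI",
    model="GPT-4o",
    required_subscription="ChatGPT Plus",
    sample_data_url="https://example.org/sample.pdf",
    source_event="(vendor conference, placeholder)",
    visitor_ratings=[4, 3, 4],
)
print(record.average_rating())  # 3.67
```

A real catalog would of course add provenance and versioning fields, but even this much structure would let visitors compare, say, two PDF-summarization use cases side by side.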
The public librarian should also maintain catalogs of benchmark tests. Benchmark tests of large language models are designed to assess the overall power of an AI model, whereas use cases, as their name implies, are designed to show how specific innovations in genAI models can be used effectively.
- The good news is that many reputable institutions are already providing open-source benchmark tests that assess the overall power of Big Tech’s models. A few of these institutions are listed in Appendix 3 to this note.
- The bad news, again, is that there is no one place that anyone can go to find out which tests exist, which models have been tested, and the results of these tests.
- The nation needs at least one reputable organization to act as a curator of a collection of existing benchmark test results: a librarian that maintains up-to-date catalogs of all the test results of the models in the library, plus links to later test results on later versions of the models. (A sketch of one such record appears below.)
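In the same illustrative spirit, a benchmark-results record might look like the sketch below. Again, every field name is an assumption; the point is only that each result should carry its exact model version plus a link to any later run that supersedes it.

```python
from dataclasses import dataclass

@dataclass
class BenchmarkResult:
    """One entry in a hypothetical benchmark-results catalog (fields illustrative)."""
    benchmark: str          # e.g., "MMLU"
    maintainer: str         # organization that publishes the benchmark
    model: str              # model tested
    model_version: str      # exact version, so later runs can be linked back
    score: float            # headline score as published by the tester
    results_url: str        # link to the published result
    superseded_by: str | None = None  # link to a later run on a newer version

entry = BenchmarkResult(
    benchmark="MMLU",
    maintainer="(varies)",
    model="GPT-4o",
    model_version="(exact release tag goes here)",
    score=0.0,                                   # placeholder, not a real result
    results_url="https://example.org/results",   # placeholder
)
```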
Ever since OpenAI announced ChatGPT running on GPT-4 back in March 2023, the New York Times has mostly published what might charitably be called “old news” rewrites of Big Tech press releases 2 to 3 days after these stories had been covered by TechCrunch, Wired, The Verge, and other tech publications.
The Times just signed an agreement with Amazon that enables Amazon to use a substantial portion of the millions of articles published by the Times. The terms and fees were not disclosed in the announcement of the agreement, but the price was evidently high enough for the Times to permit Amazon to train its generative AI models on the Times publications.
- This agreement means that the venerable New York Times is no longer sitting in the stands observing generative AI from a distance. The Times is definitely out of the stands and onto the playing field.
- That’s why the editor of this blog thinks that the Times’ well-educated subscribers have reason to expect that it might begin to cover the incredibly important generative AI revolution in a more responsible manner as soon as possible, not just with better articles, but by offering them more substantial benefits, e.g., access to catalogs of use cases and benchmark tests.
- This does not mean that the Times should run its genAI catalogs as a profit center the way it does its Wirecutter product-buying guide. Subscribers should get unlimited access to the genAI catalogs for free, but non-subscribers should face a very low paywall, for example, 10 to 15 free inquiries per week. (A minimal sketch of such a meter follows.)
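As a minimal sketch of how such a meter might work, assuming a 10-inquiry weekly quota and purely hypothetical names:

```python
import datetime
from collections import defaultdict

FREE_INQUIRIES_PER_WEEK = 10  # assumption: anywhere in the proposed 10-15 range

# Maps a visitor ID to (ISO year, ISO week) -> inquiries used that week.
usage: dict[str, dict[tuple[int, int], int]] = defaultdict(lambda: defaultdict(int))

def allow_inquiry(visitor_id: str, is_subscriber: bool) -> bool:
    """Subscribers are unmetered; non-subscribers get a weekly free quota."""
    if is_subscriber:
        return True
    year, week, _ = datetime.date.today().isocalendar()
    if usage[visitor_id][(year, week)] >= FREE_INQUIRIES_PER_WEEK:
        return False  # past the paywall: prompt the visitor to subscribe
    usage[visitor_id][(year, week)] += 1
    return True

print(allow_inquiry("visitor-123", is_subscriber=False))  # True until quota used
```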
World-shaking events
The Times takes pride in its commitment to keeping its well-educated readers informed about the world’s most important issues, whether they involve climate change, the sudden emergence of authoritarian regimes, rising instability in trade relationships, you name it. Yes, these are very important issues that deserve extensive and intensive coverage by the Times.
But the world-shaking magnitude of the potential positive and negative impacts of generative AI, as it evolves into artificial general intelligence (AGI) faster than anyone predicted just a few years ago, is at least as important as any other issue that anyone can put on the table.
Unique qualifications
The New York Times is one of the few publications of any kind that can provide extensive and intensive coverage of the game-changing developments in generative AI that are popping up at a faster and faster pace. Its stories about genAI will provide valuable indicators of new use cases that should be added to its catalogs. And its substantial subscriber base will make it relatively immune to pressure from advertisers.
When the CDC failed to provide timely and reliable national COVID-19 data in early 2020, The Atlantic stepped in with The COVID Tracking Project, launched on March 7, 2020.
What began as a volunteer effort quickly became the country’s most trusted and widely cited source of pandemic data, the gold standard used by public health agencies, media outlets, researchers, and even the federal government. The project tracked:
- Testing data (total tests, test results, and positivity rates)
- Confirmed COVID-19 cases, by state and nationally
- Hospitalization data, including current hospitalizations and ICU usage
- Death counts, including fatalities in long-term care facilities
- Racial and ethnic breakdowns of cases and deaths
- Outbreak data from nursing homes and assisted living centers
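For readers who want to see this data firsthand, the sketch below pulls the national daily time series from the project’s archived public API. The v1 endpoint and field names follow the project’s published documentation as best I recall it; treat both as assumptions to verify before relying on them.

```python
import json
import urllib.request

# The COVID Tracking Project's archived API (data collection ended March 2021).
URL = "https://api.covidtracking.com/v1/us/daily.json"

with urllib.request.urlopen(URL) as resp:
    daily = json.load(resp)  # list of per-day national records, newest first

latest = daily[0]
print(latest["date"])                        # e.g., 20210307, the final day
print(latest.get("positive"))                # cumulative confirmed cases
print(latest.get("hospitalizedCurrently"))   # current hospitalizations
print(latest.get("death"))                   # cumulative deaths
```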
The project’s reliability stemmed from its collaboration with:
- Public health officials
- Epidemiologists
- Health data scientists
- Veteran journalists and data analysts from The Atlantic
These advisors helped shape the methodology, ensuring that the data met professional standards even as it was gathered and published by a largely volunteer force.
The editor of this blog provided ChatGPT on GPT-4o and Meta AI on Llama 4 with the same prompt:
- Provide a summary of the percentage of its income that the NY Times derived from advertising and subscriptions in each of the following years -- 2000, 2004, 2008, 2012, 2016, 2020, 2024 ... and note when the Times first posted a paywall.
He used the response from each chatbot to check the other’s response. The two responses were more or less the same because the chatbots consulted almost the same sources. Meta AI’s response is quoted below because it was more concise. (A sketch of this cross-checking workflow follows.)
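For readers who want to reproduce this kind of cross-check, here is a minimal sketch that sends one prompt to two models and prints the answers side by side. The OpenAI call uses the official Python client; Meta AI offers no comparable public API as far as I know, so `ask_meta_ai` is a hypothetical placeholder to be replaced with whatever access you have.

```python
from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY set

PROMPT = (
    "Provide a summary of the percentage of its income that the NY Times "
    "derived from advertising and subscriptions in 2000, 2004, 2008, 2012, "
    "2016, 2020, 2024, and note when the Times first posted a paywall."
)

def ask_openai(prompt: str) -> str:
    """Ask GPT-4o via the official OpenAI Python client."""
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_meta_ai(prompt: str) -> str:
    # Hypothetical placeholder: Meta AI exposes no official public chat API,
    # so substitute a second provider or a locally hosted Llama endpoint here.
    return "(replace ask_meta_ai with a real second-model call)"

for name, answer in [("GPT-4o", ask_openai(PROMPT)),
                     ("Meta AI / Llama 4", ask_meta_ai(PROMPT))]:
    print(f"--- {name} ---\n{answer}\n")
```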
Response from Meta AI on Llama 4
"The New York Times has undergone significant changes in its revenue model over the years. Here's the breakdown of advertising versus subscription income for the specified years:
- 2004:
- Advertising: $1,222 million (61% of total revenue)
- Print Circulation: $616 million (31% of total revenue)
- Digital Subscription: Not available (N/A)
- 2008:
- Advertising: $1,068 million (56% of total revenue)
- Print Circulation: $668 million (35% of total revenue)
- Digital Subscription: N/A
- 2012:
- Advertising: $712 million (45% of total revenue)
- Digital Subscription: $114 million (7% of total revenue)
- Print Circulation: $681 million (43% of total revenue)
- 2016:
- Advertising: $581 million (37% of total revenue)
- Digital Subscription: $232 million (15% of total revenue)
- Print Circulation: $647 million (42% of total revenue)
- 2022 (latest available data):
- Advertising: $523 million (23% of total revenue)
- Digital Subscription: $979 million (42% of total revenue)
- Print Circulation: $574 million (25% of total revenue)
As for 2024, the data isn't available yet. However, we can see a clear trend of increasing revenue from digital subscriptions and decreasing revenue from advertising.
The New York Times introduced a paywall on its online edition in 2011, launching its digital subscription model. This move marked a significant shift in its revenue strategy, with digital subscription revenue growing from $47 million in 2011 to $979 million in 2022."
Primary Source = https://www.visualcapitalist.com/cp/new-york-times-revenue-chart/
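As a quick arithmetic check on the quoted figures: backing out total revenue from the advertising line for each year and recomputing the other segments’ shares reproduces the quoted percentages within rounding. The dollar amounts below come straight from the response above.

```python
# Figures ($M, quoted %) taken from the Meta AI response quoted above.
rows = {
    2004: {"advertising": (1222, 61), "print_circulation": (616, 31)},
    2012: {"advertising": (712, 45), "digital_subscription": (114, 7),
           "print_circulation": (681, 43)},
    2022: {"advertising": (523, 23), "digital_subscription": (979, 42),
           "print_circulation": (574, 25)},
}

for year, segments in rows.items():
    adv_usd_m, adv_pct = segments["advertising"]
    total = adv_usd_m / (adv_pct / 100)  # total revenue implied by the ad line
    print(f"{year}: implied total revenue = ${total:,.0f}M")
    for name, (usd_m, quoted_pct) in segments.items():
        print(f"  {name}: computed {usd_m / total:.0%} vs. quoted {quoted_pct}%")
```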
Appendix 3: Benchmark Organizations

| Organization | Benchmark(s) | Primary Focus Area |
| --- | --- | --- |
| MLCommons | MLPerf (Training, Inference) | System performance: throughput, latency for cloud & edge |
| Epoch AI | FrontierMath & others via AI Benchmarking Dashboard | Challenging tasks across AI capabilities |
| Vellum AI | SOTA leaderboards (GPQA, reasoning, non-saturated evals) | Logic/reasoning; tracking cutting-edge LLM performance |
| LiveBench | Real-time LLM evals | Reliable, contamination-aware model evaluation |
| Center for AI Safety | Humanity’s Last Exam | Multimodal reasoning at the frontier of human knowledge |
| SCALE AI | AgentBench, SWE-Bench, τ-Bench (via community platforms) | Agent reliability, multi-agent coordination, ML problem solving |
| PerfKitBenchmarker | Big data, HPC, networking, storage benchmarks | Infrastructure metrics: CPU, I/O, network across clouds |
| Other Key Suites | MMLU, GLUE/SuperGLUE, GSM8K, SQuAD | General language understanding & reasoning abilities |
Official Links to Benchmark Organizations
- MLCommons (MLPerf): https://mlcommons.org
- Epoch AI (AI Benchmarking Dashboard): https://www.epochai.org/benchmarks
- Vellum AI (SOTA leaderboards): https://www.vellum.ai/leaderboard
- LiveBench (contamination-aware model evaluation): https://livebench.org
- Center for AI Safety (Humanity’s Last Exam): https://www.safe.ai
- SCALE AI (AgentBench, SWE-Bench, τ-Bench): https://scale.com
- PerfKitBenchmarker (by Google): https://github.com/GoogleCloudPlatform/PerfKitBenchmarker
- MMLU, GLUE/SuperGLUE, GSM8K, SQuAD