As I mentioned in class, LLMs are great at generating text, but not all text can be easily understood at a glance. For example, compare the following two descriptions of a product. Both carry the same amount of information, but which one do you prefer?


Description 1

The Apple Watch Ultra 3 is built with a Grade 5 titanium case available in natural or black, measuring 49mm in height, 44mm in width, and 12mm in depth, with a flat sapphire crystal Always-On Retina OLED display offering 422×514 pixels at 326 ppi and up to 3000 nits peak brightness. It weighs 61.6 grams in natural titanium or 61.8 grams in black and fits wrists sized 130–210mm. Powered by the S10 chip with a dual-core 64-bit processor, a 4-core Neural Engine, and 64GB of storage, it features a customizable Action button, Digital Crown with haptic feedback, side button, double-tap and wrist flick gestures, and Siri with on-device processing. Sensors include electrical and optical heart sensors, a blood oxygen sensor, temperature sensors, depth gauge, water temperature sensor, compass, always-on altimeter, high-g accelerometer, HDR gyroscope, and ambient light sensor. Health features cover the Blood Oxygen app, ECG, Cycle Tracking with ovulation estimates, heart rate monitoring with high/low and irregular rhythm notifications, as well as apps for medications, mindfulness, noise, sleep tracking (with apnea detection), and hypertension notifications. Audio is handled by a three-microphone array with wind noise reduction and dual speakers, while safety tools include satellite connectivity for SOS, Messages, and Find My, plus international emergency calling, Crash and Fall Detection, siren, flashlight, and Backtrack GPS. With a durable design tested to MIL-STD 810H standards, it is water-resistant to 100m, IP6X dust-resistant, and supports recreational scuba diving to 40m. Connectivity options include Wi-Fi 4 (2.4/5GHz), Bluetooth 5.3, precision dual-frequency GPS, a second-generation Ultra Wideband chip, Apple Pay, and GymKit, while model A3281 adds LTE and 5G NR with international roaming. The battery lasts up to 42 hours with normal use or 72 hours in Low Power Mode, and supports fast charging (80% in ~45 minutes). 
Running watchOS, it supports a wide range of workouts—running, cycling, hiking, swimming, diving, and many more—offering advanced metrics like VO₂ max, stride length, cadence, power, and FTP with power meters, plus offline maps, waypoints, and Race Route features. Accessibility includes VoiceOver, Zoom, AssistiveTouch, RTT, Taptic Time, and more. Environmentally, it is made with 40% recycled materials (including 100% recycled titanium, cobalt, lithium, rare earths, tungsten, and steel), manufactured with renewable energy, and packaged in 100% fiber-based materials under Apple’s Zero Waste Program. The Apple Watch Ultra 3 requires an iPhone 11 or later with iOS 26, and ships with a band and a 1m Magnetic Fast Charger to USB-C cable.


Description 2

| Category | Specifications |
| --- | --- |
| Material & Finish | Titanium case (Grade 5), Natural or Black |
| Size & Weight | Height: 49mm · Width: 44mm · Depth: 12mm · Display: 422 × 514 pixels, 1245 sq mm area · Weight: 61.6g (Natural), 61.8g (Black) · Fits 130–210mm wrists |
| Controls & Buttons | Customizable Action button · Digital Crown with haptic feedback · Side button · Double tap & wrist flick gestures · Siri (on-device processing) |
| Chip | S10 chip with 64-bit dual-core processor · 4-core Neural Engine · 64GB storage |
| Sensors | Electrical heart sensor · Optical heart sensor (3rd gen) · Blood oxygen sensor · Temperature sensor · Depth gauge · Water temperature sensor · Compass · Always-on altimeter · High-g accelerometer · HDR gyroscope · Ambient light sensor |
| Health Features | Blood Oxygen app · ECG app · Cycle Tracking (with ovulation estimates) · Heart Rate app · Irregular rhythm, high/low heart rate notifications · Medications app · Mindfulness · Noise app · Sleep tracking & apnea detection · Hypertension notifications |
| Display | Always-On Retina OLED (LTPO3) · 1Hz refresh · 326ppi · Flat sapphire crystal · Up to 3000 nits peak brightness (1 nit min) · Night Mode for Ultra faces |
| Audio | Three-microphone array with beamforming & wind noise mitigation · Dual speakers · Siren · Media playback |
| Battery | Up to 42h (normal use) · Up to 72h (Low Power Mode) · Built-in lithium-ion, fast-charge capable (80% in ~45min, 12h in 15min, 8h sleep tracking in 5min) |
| OS | watchOS (apps for fitness, health, safety, connectivity) |
| Workouts & Sports | Running, Cycling, Hiking, Swimming, Diving (to 40m), Multisport, plus Skiing, Snowboarding, Rowing, Yoga, HIIT, Golf, etc. · Advanced metrics (stride, cadence, power, VO₂ max, FTP, etc.) · Race Route & Pacer features · Automatic track detection · Offline maps & waypoints |
| Safety | Satellite connectivity (SOS, Messages, Find My) · Emergency SOS & International calling · Crash Detection · Fall Detection · Siren · Flashlight · Backtrack GPS |
| Durability | Water resistance 100m (ISO 22810:2010) · Dust resistance IP6X · Recreational diving to 40m (EN13319) · Tested to MIL-STD 810H (altitude, temp shock, vibration, immersion, etc.) |
| Connectivity | Wi-Fi 4 (2.4GHz/5GHz) · Bluetooth 5.3 · Dual-frequency GPS (L1 & L5, GLONASS, Galileo, QZSS, BeiDou) · 2nd-gen Ultra Wideband chip · Apple Pay · GymKit · Model A3281: LTE + 5G NR support · International roaming |
| Accessibility | VoiceOver, Zoom, AssistiveTouch, RTT, Live Speech, Taptic Time, Personal Voice, and more (vision, mobility, hearing, cognitive support) |
| Environmental | 40% recycled materials · 100% recycled titanium, cobalt, lithium, rare earths, tungsten, steel · 100% renewable manufacturing energy · 100% fiber-based packaging · Zero Waste verified suppliers |
| Compatibility | Requires iPhone 11 or later (incl. SE 2nd gen+) with iOS 26 or later |
| In the Box | Apple Watch Ultra 3 · Band · Apple Watch Magnetic Fast Charger to USB-C Cable (1m) |

Processing structured text is more straightforward because we can quickly locate information. In the second description, you can immediately find the battery life of the Apple Watch by looking at the row labeled "Battery", but in the first description you have to read through the paragraph until you reach the sentence about battery life.

Locating information is also predictable with structured text. Let's say you are given several Apple Watch descriptions and would like to compare their battery life. With the second format, you can simply go to the ninth row ("Battery") of each table, whereas with the first format you'd have to read each product's paragraph until it mentions battery life (which could be at the beginning, middle, or end of the paragraph).
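
To make the contrast concrete, here is a minimal Python sketch. The dictionary holds a few rows of Description 2 (abridged), and the string is the battery sentence from Description 1. The function names and the regular expression are illustrative assumptions, not part of any product or library.

```python
import re
from typing import Dict, Optional

# A few rows of Description 2 (abridged), stored as category -> specification.
structured: Dict[str, str] = {
    "Chip": "S10 chip with 64-bit dual-core processor",
    "Battery": "Up to 42h (normal use) · Up to 72h (Low Power Mode)",
    "OS": "watchOS",
}

# The battery sentence from Description 1.
paragraph = (
    "The battery lasts up to 42 hours with normal use or 72 hours in "
    "Low Power Mode, and supports fast charging (80% in ~45 minutes)."
)


def battery_from_table(specs: Dict[str, str]) -> str:
    """Structured text: the answer is always under the same key."""
    return specs["Battery"]


def battery_from_paragraph(text: str) -> Optional[str]:
    """Free-form text: we have to guess how the sentence is worded."""
    match = re.search(r"battery lasts up to (\d+) hours", text, flags=re.IGNORECASE)
    return f"{match.group(1)} hours" if match else None


print(battery_from_table(structured))
print(battery_from_paragraph(paragraph))
# A differently worded description defeats the guessed pattern.
print(battery_from_paragraph("You get about two days on a single charge."))
```

The table lookup works no matter how the row's content is worded, while the paragraph search only succeeds when the sentence happens to match the pattern we guessed.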

LLMs Interacting with the Outside World

You’ve built customer support bots in class. While your bot is already useful, it is extremely limited in the actions it can take. For example, it can talk about the company's refund policy, but it can't actually process a refund request on its own. Can we lift this restriction?

To add such extra capabilities to LLMs, we must remember that they can only generate text. If we want them to do anything, we have to do it for them. Think of it like going to a doctor: your doctor can write a prescription for you, but it's you who has to take the medication. The doctor won't do it for you. Similarly, an LLM can tell us what to do, and if we have a program that takes that request and runs it, then we have a system that can take actions, with the LLM acting as the central brain.

This is where structured output becomes crucial: the LLM has to write its “prescription” in a way that computer programs understand, and if we know one thing about computers, it's that they can only process information that follows a defined order and structure! For example, Excel doesn't let you write:

“this month's revenue is $15,000 but to get the profit we should subtract the expenses which amount to $4,000, so the profit would be $11,000.”

Instead, you're expected to put the revenue and expenses in cells (e.g., A1 and A2) and then put the profit as a formula (like =A1-A2) in another cell (e.g., A3).

Similarly, LLMs must write their text outputs according to the rules and patterns of the program they are interacting with. Fortunately, most of the industry has adopted JSON as the standard format, and LLMs are really good at generating JSON. I've talked about JSON in class, and we will see more examples in the future.
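
As a rough sketch of the "prescription" idea, the Python snippet below parses a JSON string that an LLM might produce and runs the matching action. Everything specific here is an assumption made for illustration: the `process_refund` function, the `"tool"` and `"arguments"` field names, and the order ID are hypothetical, and a real system would call an actual refund backend.

```python
import json


def process_refund(order_id: str, amount: float) -> str:
    """Hypothetical tool: a real system would call the payment backend here."""
    return f"Refunded {amount:.2f} for order {order_id}"


# The actions our program is willing to run on the LLM's behalf.
TOOLS = {"process_refund": process_refund}


def run_llm_request(llm_output: str) -> str:
    """Parse the LLM's JSON 'prescription' and carry it out."""
    try:
        request = json.loads(llm_output)
    except json.JSONDecodeError:
        return "Could not parse the LLM output as JSON"
    if not isinstance(request, dict):
        return "Expected a JSON object"
    tool_name = request.get("tool")
    if not isinstance(tool_name, str) or tool_name not in TOOLS:
        return f"Unknown tool: {tool_name!r}"
    # Pass the JSON arguments to the chosen function as keyword arguments.
    return TOOLS[tool_name](**request.get("arguments", {}))


# Example text an LLM could generate (made up for this sketch).
llm_output = '{"tool": "process_refund", "arguments": {"order_id": "A-1001", "amount": 15.0}}'
print(run_llm_request(llm_output))      # Refunded 15.00 for order A-1001
print(run_llm_request("Sure, I will refund you!"))  # free-form text cannot be run
```

The LLM only writes text. It is the surrounding program that decides which actions exist and actually performs them.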


So an LLM that can generate JSON output instead of free-form text can interact with external programs that expect JSON inputs. But how does an LLM know what kind of JSON to generate? For example, let's say we have a program (a.k.a. “tool”) that processes purchases. It expects JSON input like this:

{
	"id": "550e8400-e29b-41d4-a716-446655440000",
	"name": "4K Ultra HD Monitor",
	"price": 399.99,
	"currency": "USD",
	"available": true,
	"tags": ["electronics", "display", "monitor"],
	"dimensions": {
		"width": 60.5,
		"height": 35.2,
		"depth": 5.5,
		"unit": "cm"
	},
	"category": "electronics",
	"release_date": "2025-09-15",
	"related_products": [{
		"id": "123e4567-e89b-12d3-a456-426614174000",
		"name": "HDMI Cable"
	}]
}
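
As a small sketch of how a receiving program might read this input, the Python snippet below parses an abridged copy of the JSON above and pulls out top-level and nested fields. The `summarize_purchase` function is a hypothetical helper written for this illustration, not part of any real purchasing tool.

```python
import json

# Abridged copy of the purchase JSON shown above.
raw = """
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "4K Ultra HD Monitor",
  "price": 399.99,
  "currency": "USD",
  "dimensions": {"width": 60.5, "height": 35.2, "depth": 5.5, "unit": "cm"},
  "related_products": [
    {"id": "123e4567-e89b-12d3-a456-426614174000", "name": "HDMI Cable"}
  ]
}
"""


def summarize_purchase(text: str) -> str:
    """Hypothetical helper: read top-level, nested, and list fields."""
    item = json.loads(text)
    dims = item["dimensions"]  # a nested object becomes a dict
    size = f'{dims["width"]} x {dims["height"]} x {dims["depth"]} {dims["unit"]}'
    # A JSON array of objects becomes a list of dicts.
    related = ", ".join(p["name"] for p in item["related_products"])
    return f'{item["name"]}: {item["price"]} {item["currency"]}, {size}, related: {related}'


print(summarize_purchase(raw))
# 4K Ultra HD Monitor: 399.99 USD, 60.5 x 35.2 x 5.5 cm, related: HDMI Cable
```

Because every field sits at a known place in the structure, the program can reach it by name without reading any surrounding text.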

If we visualize this JSON using https://jsontree.vercel.app/, its hierarchical structure becomes clearer: