Weekly recap: 2023-08-13

Posted by Q McCallum on 2023-08-13

What you see here is the last week’s worth of links and quips I have shared on LinkedIn, from Monday through Sunday.

For now I’ll post the notes as they appeared on LinkedIn, including hashtags and sentence fragments. Over time I might expand on these thoughts as they land here on my blog.

2023/08/07: DoorDash’s generative AI assistance

I’m a little late in sharing this one (it’s from a couple weeks back) but it’s still an interesting use of generative AI.

This new DoorDash system blends search and recommendations to help people as they sort out what to order and from where.

I give DoorDash bonus points for treating this as a limited experiment: they’re still in a testing phase, and the system warns end-users that results might be iffy at times. (That’s much better than, say, rolling out the system to everyone at once and making sweeping declarations about functionality before it’s been tested.)

DoorDash Is Working on an AI Chatbot to Speed Up Food Ordering” (Bloomberg)

2023/08/08: Facial recognition failure

We have yet another failure of facial recognition technology:

Eight Months Pregnant and Arrested After False Facial Recognition Match” (NYT)

I’m not surprised by the model error here. All AI models fail at some point, and facial recognition has an especially poor track record.

What surprises (appalls) me is that:

1/ In light of the poor track record, companies keep buying facial recognition tools.

2/ Companies keep taking the models’ outputs as gospel.

Both of these steps are terribly irresponsible.

Even if you’re not deploying facial recognition tools, what can you learn from this incident? How can you keep your company’s AI on the right track?

1/ When someone is pitching an AI-driven tool to you, ask (and follow up on) questions such as: “How, specifically, did you build this training dataset? What metrics did you use to evaluate the models? How did you analyze the errors that cropped up during training?”

(Bonus points: don’t just take “accuracy” for an answer. Accuracy is one of many metrics you can use to evaluate a model. It’s not always the best metric. And for extremely sensitive situations, numbers that sound very high – “oh, it’s 98% accurate” – are in fact quite low.)

2/ Be mindful of how much weight the model gets in your decisions. The higher the stakes, the more you need to treat the model’s output as a mere suggestion. You really need people to think through what the model is telling them, and to override or disregard the model when additional context shows that it is clearly incorrect.

3/ Read the vendor’s contract (or TOS, or whatever) very carefully. When the model is wrong, who is ultimately held responsible? (That is: what percentage of the “model risk” falls on your plate? The answer is usually “100%,” which is why points #1 and #2 are so important.)

2023/08/09: Scammers get creative with technology … again

It’s no surprise that scammers are pushing a flood of AI-generated nonsense. It’s similar to the unit economics of sending spam: when it costs you effectively nothing to send an e-mail, just about any response rate equals a profit. So why not send tens of millions of messages, or set up a ton of fake websites, in the hopes of landing a fractional percentage point of clicks?

This new scam takes that thinking to another level: mix AI-generated text, generated profile pics, soundalike author names, and print-on-demand services. Sprinkle in some fake reviews for good measure.

A New Frontier for Travel Scammers: A.I.-Generated Guidebooks” (NY Times)

I don’t condone their actions, but I have to acknowledge that criminals have long been technology innovators. Expect to see more “creative” uses of generative AI over the coming years.

2023/08/10: New blog post on a grocery store’s AI bot

You may have seen a recent Guardian article about a grocery store’s AI bot. It generated unappetizing, even poisonous recipes.

Sure, you can laugh about it. You can also consider what lessons we can learn from this incident. Because today, this was a problem in a grocery store. Tomorrow, it could be your company’s AI making the news.

I started to summarize my thoughts here but realized that would require more space than a LinkedIn post should take. So I’ve written a blog post, instead:

Grocery bots and chlorine cocktails

2023/08/11: Red-teaming the AI bots

Red-teaming exercises (simulated adversarial interactions) are key when deploying public-facing systems. By occasionally pretending to be a bad actor, you can get ahead of some of the real bad actors out in the field.

This is a common practice in IT security. And now I’m happy to see that more people are red-teaming against generative AI chatbots (aka large language models, aka LLMs):

Meet the hackers who are trying to make AI go rogue” (Washington Post)

(This is an approach I called out a couple months ago, in a Radar piece “Risk Management for AI Chatbots” . If you’d like more thoughts on AI risk management and AI safety, feel free to follow me here on LinkedIn or check out my blog at https://qethanm.cc/news .)