Weekly recap: 2023-02-19

Posted by Q McCallum on 2023-02-19

What you see here is the last week’s worth of links and quips I have shared on LinkedIn, from Monday through Sunday.

For now I’ll post the notes as they appeared on LinkedIn, including hashtags and sentence fragments. Over time I might expand on these thoughts as they land here on my blog.

2023/02/13: Never let the machine run unattended

Well, that didn’t take long: Twitch has suspended the GPT-3-based Seinfeld clone for saying terrible things.

“AI-Generated ‘Seinfeld’ Show ‘Nothing Forever’ Banned on Twitch After Transphobic Standup Bit” (Vice)

(Credit where it’s due: I originally found this in Der Spiegel.)

There’s a surface lesson here about #AI-related #risk: pretty much any generative AI that is trained on a broad dataset has a dark past. And will eventually cross a line. Yes.

But there’s a deeper AI risk lesson. Something I have been pointing out for years. Something any algo trader will tell you. And that lesson is:

You never, NEVER let the machine run unattended.

Attach a monitoring system. Scan the feed for a list of off-limits actions. Institute a delayed feed, so a person can check it. Whatever it takes.
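To make that concrete, here is a minimal Python sketch of two of those guardrails working together: a scan for off-limits content and a delayed feed. Everything in it is an assumption for illustration (the `BLOCKLIST` terms, the `DelayedFeed` class, the count-based delay). It is not how the "Nothing Forever" team or Twitch does it.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical placeholder terms; a real list would be curated by people.
BLOCKLIST = {"forbidden_phrase", "another_banned_term"}


def is_off_limits(text: str) -> bool:
    """Return True if the text contains any blocklisted term (case-insensitive)."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)


@dataclass
class DelayedFeed:
    """Holds generated items back so a person has time to check them."""

    delay: int = 3  # number of items kept in the buffer before release
    pending: deque = field(default_factory=deque)
    held_for_review: list = field(default_factory=list)

    def submit(self, text: str) -> list:
        """Queue one generated item; return the items now old enough to publish."""
        if is_off_limits(text):
            # Never publish automatically; a person decides what happens next.
            self.held_for_review.append(text)
            return []
        self.pending.append(text)
        released = []
        while len(self.pending) > self.delay:
            released.append(self.pending.popleft())
        return released

    def retract(self, text: str) -> bool:
        """Let a human reviewer pull an item that is still in the buffer."""
        if text in self.pending:
            self.pending.remove(text)
            return True
        return False
```

A keyword scan like this will miss plenty, which is why the buffer matters: it gives a human reviewer a window to call `retract` before anything goes out.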

The team claims that this happened because they’d briefly switched to a lower-grade GPT-3 model. And, sure, that may be the technical cause… But you are still ultimately responsible for what your model does.

2023/02/14: Automation eats work

Your periodic reminder:

  1. ML/AI is a form of automation
  2. Automation eats work
  3. If there is work that people don’t want to do, they will happily hand it off to an automated system

So when we see people hand work off to ChatGPT or any other tool, it’s time to ask ourselves: how much do they value that work? And how much should we value the work product we’re asking of them?


2023/02/15: Yet another market for personal data

This article is, among other things, an important #dataethics reminder.

Key excerpt:

The Health Insurance Portability and Accountability Act, known as HIPAA, restricts how hospitals, doctors’ offices and other “covered health entities” share Americans’ health data.

But the law doesn’t protect the same information when it’s sent anywhere else, allowing app makers and other companies to legally share or sell the data however they’d like.


2023/02/15: The data portfolio

What Bill Reynolds has said here aligns with my experience.

Bottom line: Your company’s data catalog represents a mix of assets and liabilities.

  • By default, any given dataset is a liability – because it incurs the direct costs to collect and store it, plus the possible future costs of regulatory risk and PR (reputation) risk.
  • It’s only with a lot of planning, discipline, effort, and upkeep (plus a bit of luck) that it may become an asset – something that generates revenue.

So if you’ve been collecting tons of data “just in case, maybe, someday” … you’re sitting on a lot of risk.

The take-away lesson? Spend some time reviewing your catalog to sort out which datasets are the assets, which are the liabilities, and which ones are “mispriced” as far as how you treat them. Be prepared for surprises.

(And if you aren’t already tracking your datasets, well, this is the time to start.)
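As a rough sketch of what that review could look like in Python: the `Dataset` fields and the cost-versus-revenue rule below are assumptions for illustration, and they deliberately leave out the harder-to-quantify regulatory and reputation risk.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    annual_cost: float     # collection + storage costs (hypothetical units)
    annual_revenue: float  # revenue you can attribute to this dataset
    treated_as: str        # "asset" or "liability": how the company handles it today


def actual_status(ds: Dataset) -> str:
    """Simplified rule: a dataset is an asset only if it earns more than it costs."""
    return "asset" if ds.annual_revenue > ds.annual_cost else "liability"


def mispriced(catalog: list) -> list:
    """Names of datasets whose treatment does not match what the numbers say."""
    return [ds.name for ds in catalog if actual_status(ds) != ds.treated_as]


# Hypothetical catalog entries, invented for this example.
catalog = [
    Dataset("clickstream_archive", annual_cost=120.0, annual_revenue=0.0, treated_as="asset"),
    Dataset("customer_orders", annual_cost=40.0, annual_revenue=300.0, treated_as="asset"),
]
surprises = mispriced(catalog)
```

Here `surprises` contains only `clickstream_archive`: it is treated as an asset but generates no revenue, which is the “just in case, maybe, someday” pattern.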

For a deeper look at dataset-as-a-risk, I’ll offer up a blog post from last year: “Not All Datasets Are Created Equal.”

#data #risk

2023/02/16: Pay attention to those metrics

I published a short article called “When good metrics are bad.” (Now mirrored on this site.)

2023/02/17: Disabling the autopilot

I published an article, “Is your company on autopilot?” (Now mirrored on this site.)

2023/02/17: Religious leaders explore the impact of AI

Religious leaders ponder #AIEthics and the impacts of #AI on the world.

Key excerpt:

Sheikh bin Bayyah asked who we should hold responsible for the mistakes of artificial intelligence. What will happen to communication between humanity? How will artificial intelligence affect our behaviour? How do we avoid failures? Would we, he asked poetically, like the little silkworm, suffocate inside our own creation?

I’d like to emphasize: not only are these very important questions, but they’re important questions for everyone involved in delivering AI-based services.

Stakeholders, product owners, and data practitioners all need to work through them if we are to reduce our exposure to the downside #risk of using AI, while still leaving ourselves open to all of the upside gain.