Weekly recap: 2023-07-09

Posted by Q McCallum on 2023-07-09

What you see here is the last week’s worth of links and quips I have shared on LinkedIn, from Monday through Sunday.

For now I’ll post the notes as they appeared on LinkedIn, including hashtags and sentence fragments. Over time I might expand on these thoughts as they land here on my blog.

2023/07/03: Sneaking in an explanation of ML/AI

This video briefly tells the story of song-matching app Shazam.

How Shazam IDs Over 23,000 Songs Each Minute | WSJ Tech Behind

While sharing some harsh realities of building a company, it manages to sneak in a conceptual explanation of search technology and ML/AI.

How so? Well, Shazam works by:

  • Converting a bunch of songs into numeric representations and saving them in a database.
  • Converting a new sound (the song coming through your phone’s mic) into a numeric representation.
  • Comparing the new sound’s numeric form to all of those in its database, in the hopes of finding a match.
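
To make those three steps concrete, here is a toy sketch in Python. It is not Shazam's actual method: the three-number "fingerprints" are made up by hand, the song titles are placeholders, and cosine similarity is just one common way to measure closeness that I've assumed for illustration.

```python
import math

# Step 1: songs already converted to numbers and saved in a "database".
# These fingerprints are hypothetical; a real system derives them from audio.
SONG_DB = {
    "Song A": [0.9, 0.1, 0.3],
    "Song B": [0.2, 0.8, 0.5],
    "Song C": [0.4, 0.4, 0.9],
}


def cosine_similarity(a, b):
    """Return how closely two vectors point in the same direction (1.0 = identical)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(query, db):
    """Step 3: compare the query against every stored song and keep the closest."""
    return max(db, key=lambda title: cosine_similarity(query, db[title]))


# Step 2: the new sound from the phone's mic, also converted to numbers
# (again, invented values for the sake of the example).
mic_sample = [0.25, 0.75, 0.45]

print(best_match(mic_sample, SONG_DB))  # prints "Song B"
```

Comparing against every song one by one works for three entries. At real-world scale you'd want an index that narrows down the candidates first, but the idea is the same.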

This may seem far afield from the “predict housing prices” or “classify documents” or “find matching web pages” examples that you see in search or AI, but it’s all the same thing deep down:

  • Data scientists, ML engineers, and search engineers convert real-world concepts to numerical/mathematical representations.
  • For an ML model: algorithms look for patterns in those numbers, which the model uses to make predictions or classifications.
  • For search: a search engine compares one set of numbers (your query) against the numbers for each web page in its database, to give you the items that are (numerically, statistically) closest to it.
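
For the ML model case, a minimal sketch of "look for patterns in those numbers" could be a straight-line fit on housing prices. Everything here is assumed for illustration: the sizes and prices are invented, and ordinary least squares stands in for whatever algorithm a real team would pick.

```python
# Hypothetical training data: home size in square metres, price in thousands.
sizes = [50, 70, 90, 110]
prices = [150, 210, 270, 330]


def fit_line(xs, ys):
    """Find the slope and intercept of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": the algorithm finds the pattern in the numbers.
slope, intercept = fit_line(sizes, prices)


def predict(size):
    """The "model": apply the learned pattern to a home it has not seen."""
    return slope * size + intercept


print(predict(100))
```

The real world is messier than four points on a perfect line, but the shape is the same: numbers in, pattern found, prediction out.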

That’s it. The core of search and AI is just the two steps of “convert things into numbers” and “let computers compare those numbers.” It’s fascinating to see all of the applications of this concept.

(This is when I should explore why search is one of the unsung heroes of AI, but I’ll save that for another day …)

2023/07/06: New blog post on hiring

My colleague Sierra Henson recently asked me about the impact that off-the-shelf ML/AI models and other tools would have on data science hiring.

I’ve recapped my thoughts in this blog post: “Why do we need data scientists, then?”

As a reminder: I’m running a short series on hiring and teams in the data/ML/AI space. What else would you like me to cover? Drop a note in the comments, DM me here, or contact me through my website.

(Examples of past topics include “what to look for in your first data scientist” and “ways to improve your chances of hiring the best candidate.” I also have a couple of posts that’ll round out the series, but I’ll release those a little later.)

2023/07/08: The making of “The End of Truth”

Behind the scenes: Der Spiegel’s cover story on generative AI and “the end of truth” (“Das Ende der Wahrheit”).

Bonus: they share the Midjourney prompts they used to create the images. Note the style and specificity of the wording.

“Wie unser Titelbild zum »Ende der Wahrheit« entstanden ist” (Der Spiegel)

2023/07/09: It matters where you get your training data

1/ AI needs tons of training data to find meaningful, generalizable patterns. For now we still need human involvement to generate and/or label that data.

2/ How you get that data matters. Spying on customers? Bad. Buying from a vendor? Good, but then you’re exposed to risks in your data supply chain (such as: vendor shuts down, they get bought out by your rival, or they turn out to have acquired that data through sketchy means). Collecting it yourself? Best.

“Tesla is offering to pay people between $18 to $48 an hour to drive its EVs this summer” (Insider)