Undervalued Practices in ML/AI, Part 1: Getting Started

Posted by Q McCallum on 2021-03-22

This post is one in a series on undervalued practices in ML/AI:

  • Introduction
  • Part 1: Getting started with ML/AI (this post)
  • Part 2: Hiring and Team Structure
  • Part 3: Planning projects
  • Part 4: Project Execution

1 - Develop a plan.

That is, formulate a data strategy or road map that outlines what projects you’ll execute, whom you’ll hire, and when you will (or will not) take those steps. You’re less likely to stumble if you know the road ahead.

Plenty of companies have an industry reputation for their AI work, so you might be tempted to copy whatever they have done instead of developing your own data strategy. Unless you’re solving those other companies’ exact same problems, under the exact same market conditions, and with the exact same data, it’s unlikely that you’ll be able to succeed in the exact same way.

When you develop your road map, you’d do well to focus on yourself: make plans specific to your company, your needs, your challenges, and your opportunities.

2 - Talk to someone who knows.

Don’t develop that data strategy in isolation. Work with someone who has deep knowledge of both ML/AI and business models to translate your company’s goals and challenges into use cases.

You can certainly talk to me about this. Or you can find a different data strategy consultant. You can even reach out to a trusted, experienced data scientist colleague in your network. Whomever you ask, this person should come with questions, open ears, and an open mind. They’ll need to hear your story and explore your company in order to help you achieve success.

(Given that, beware anyone who offers a prepackaged, one-size-fits-all solution that involves more of them pitching their story than hearing yours.)

You should leave those conversations with a realistic picture of what is possible in ML/AI, and an understanding of whether this is even the proper fit for your company right now. It may be a let-down to hear that your company should wait a year or two before getting serious with ML/AI, yes. That’s still better than losing money by investing too early.

3 - Develop that data strategy before you hire your first data scientist.

You want your first data hire to hit the ground running, and you want to set them up for as many wins as possible. You’ll boost their chances of success – ergo, your chances of success – by handing them a realistic road map on their first day.

Some companies try to cut corners by hiring the data scientist and tasking them with developing the data strategy. How can you tell whether you even need a data scientist, much less what skills they should possess, until you have … developed that data strategy? You may call this a real chicken-and-egg problem. I prefer to describe it as “almost always a terrible idea.”

(Occasionally, companies will even try to use data scientist job interviews as a way to develop their data strategy. Not only is this unprofessional and unethical, it’s also unlikely to yield meaningful results.)

People sometimes resist developing a data strategy, because they “have a hunch” or “just know that ML/AI is a fit.” I definitely agree that you should trust your gut … so long as it’s not a substitute for making a formal plan.

4 - Spread data literacy throughout the company.

ML/AI does not live in isolation; it has the potential to touch every aspect of your company. It can lead to amazing success but also reputation-ruining failure.

All of this means that your data scientists cannot be the only people in the company who understand how this all works. Any person who has influence on what your company does, or what services it builds, must have a clear understanding of what is realistic for ML/AI. This most certainly includes the executive team and the product team.

There’s no need for the CEO to take classes so that they can build a working neural network. (Though, admittedly, that would be interesting.) It does mean that when the data scientists are explaining a model’s performance, or the need for more training data, or that a model isn’t performing as expected, everyone in that meeting should understand what they mean. No one should hand-wave away any concerns around what the model can achieve, or ethical matters of sourcing data, or anything else.