MailChimp Not Quite Ready for PrimeTimeML


	By Fred Tabsharani Founder and CEO at Loxz Digital Group
	September 03, 2021

With perhaps the most coveted valuation in the email industry, at close to $10B, MailChimp is considered the most forward-thinking ESP on the planet, boasting 12M customers, outstanding brand recognition, and an incredible leadership suite. But when it comes to installing RealTimeML, it’s lollygagging, mainly because it has not yet justified the value of productionalizing RealTimeML across its client base, and also because it is a challenge to execute!

Last week we approached the CEO of MailChimp, Mr. Ben Chestnut, regarding the business impact a RealTimeML recommendation engine can deliver to their ESP in terms of revenue. To our delight, he immediately referred us to their Chief Data Science Officer, Mr. David Dewey.

Now, those in the know will appreciate the value of the 16-year relationship we’ve had with members of the MailChimp organization. During my tenure as Head of Global Marketing at Port25, they were our most treasured client. Readers should also know how genuine those relationships have been over the years: hosting MailChimp at in-person events, commending MailChimp for their pioneering mindset, and establishing lasting bonds where it matters most, with people like Brandon Fouts, Director of Engineering at MailChimp, among many others. Further, I have personally witnessed the evolution of MailChimp and the entire ESP landscape, which I have covered over the last dozen years.

As advocates of the ESP marketplace, we once again arrive at a pivotal moment, one in which Email Service Providers must reinvent themselves to effectively and efficiently manage this RealTimeML production boom and stay competitive. If they procrastinate and decide not to adopt RealTimeML within the campaign-building process, they risk losing significant market share and, in the words of my professor at Cal Berkeley, Mr. Alberto Todeschini, PhD, perhaps becoming “obsolete”.

And the story begins:

Right after Ben introduced me to David Dewey, the Chief Data Science Officer, Mr. Dewey, without regard for our long-standing relationship or even a gracious introduction, handed me my head on a silver platter over the idea of building out a real-time recommendation engine at MailChimp. We immediately sensed frustration from David Dewey about the challenging model deployment terrain ahead of MailChimp and other leading ESPs, as there is a significant amount of technical debt that they may have acquired and need to overcome.

Here was his response (and keep in mind I have never met David Dewey): Hey Fred - I have to be honest; I’m pretty surprised to be hearing from you given this article you wrote here. I want to share back with you a snippet from something I wrote to my team in response:

“The thing that stood out from Fred’s article is how nonchalantly he just throws out, “Build this model,” or “it’s just as simple as…” This attitude has become endemic amongst “AI Visionaries.” They tend to completely ignore the operational aspects, the UX implications, the accuracy requirements, and the continuous improvement challenges associated with the development and productionalization of real, actual, working models. When we release a model, it has to work for 12 million users. We have to know the likelihood of it returning false positives/negatives and adjust the UX accordingly. We have to be careful to protect the Mailchimp brand and commitment to DE&I, something that many companies have faltered on with their AI models. We don’t treat anything as “it’s just as simple as…,” or bad things happen. I really don’t see a path forward for your product at Mailchimp. Being insulting is not a way to get your foot in the door, and your article makes it pretty clear that you don’t have enough experience building models at scale to be helpful for a user base as large as Mailchimp’s.”

There are some bright spots in his response, but you can immediately sense the frustration and the enormous challenges Mr. Dewey faces. Anyone who has even thought about deploying a productionalized model that serves outputs in milliseconds across a user base as large as MailChimp’s, with little to no skew, should understand the pitfalls. Before going further, we want you to understand the business value this can provide to MailChimp and its customer base.

Business Value

We sketched out and calculated the business impact of RealTimeML in the MailChimp environment. It is significant. Here is a back-of-the-napkin breakdown of what we came up with, assuming productionalized ML were deployed across 12M clients, albeit in a staggered approach that starts with HLP and proven deployment patterns and includes extensive usability testing:

Assume a 1% activation rate, that is, 1% of 12M, or 120K clients. That would generate revenue equivalent to onboarding about 80K new clients per month, given that new customers onboard at a starting point of about $15.00 per month on average. This assumes a standard distribution agreement of 15%. If that distribution agreement were 25%, it would equate to more than 120K new clients per month for MailChimp. We also understand that creating business value for existing clients is much less expensive operationally than marketing to new ones and, more importantly, that existing clients would be more satisfied after turning on such an ML assistant.
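The arithmetic behind these figures can be checked with a short sketch. The article does not say what an activated client would pay for the ML assistant, so the snippet below back-solves that number from the 80K claim. The names `implied_gross_monthly_revenue` and `implied_price_per_activated_client` are derived assumptions for illustration, not figures from MailChimp or Loxz.

```python
# Figures taken from the text above.
TOTAL_CLIENTS = 12_000_000
ACTIVATION_RATE = 0.01            # 1% of clients turn the ML assistant on
NEW_CLIENT_MONTHLY_FEE = 15.00    # average starting price of a new client
EQUIV_NEW_CLIENTS_AT_15 = 80_000  # claimed equivalent at a 15% share


def equivalent_new_clients(gross_monthly_revenue: float, share: float) -> float:
    """Number of $15/month clients that would bring in the same revenue
    as the ESP's share of the ML assistant's gross monthly revenue."""
    return gross_monthly_revenue * share / NEW_CLIENT_MONTHLY_FEE


activated_clients = TOTAL_CLIENTS * ACTIVATION_RATE

# ASSUMPTION: back-solved from the 80K claim, not stated in the article.
implied_gross_monthly_revenue = EQUIV_NEW_CLIENTS_AT_15 * NEW_CLIENT_MONTHLY_FEE / 0.15
implied_price_per_activated_client = implied_gross_monthly_revenue / activated_clients

equiv_at_25 = equivalent_new_clients(implied_gross_monthly_revenue, 0.25)

print(f"activated clients: {activated_clients:,.0f}")
print(f"implied gross revenue per month: ${implied_gross_monthly_revenue:,.0f}")
print(f"implied price per activated client: ${implied_price_per_activated_client:,.2f}")
print(f"equivalent new clients at a 25% share: {equiv_at_25:,.0f}")
```

Under these assumptions, the 25% scenario works out to roughly 133K equivalent new clients per month, which is consistent with the "more than 120K" figure above.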

Something to think about if you are an ESP wanting to create additional value for your existing clients. Since everyone is sending emails now, creating new value in a total addressable market that might be showing a tiny bit of saturation should resonate with company stakeholders. I believe this was the impetus behind Ben Chestnut’s gracious introduction to the now infamous Mr. David Dewey.

Ben understands the urgency of RealTimeML in an ESP environment, and I’m sure David does as well, but even a prototype is not yet evident in their roadmap. Perhaps this article will cut short the lollygagging going on at MailChimp and further highlight the opportunity that exists. Now let’s get into the nuts and bolts of how we could potentially deploy a recommendation engine in an ESP environment without all the drama.

Deployment

If I were David Dewey, I’d start by asking myself: do we need RealTimeML, and why? First, a caveat: I am not a data scientist, nor do I purport to be one, and I am not an MLOps deployment specialist, although I have interviewed hundreds of data scientists and employ about a dozen, many of whom hold PhDs and have extensive experience with deep learning projects. Of the five models we have built for email, all are in the POC phase and all are predictive analysis models; one is a deep learning model using a convolutional neural network.

When scoping out a modeling project for a client (in this case, the customers of MailChimp), the CDO must continually find ways to add value to its current client base.

As users become more interested in predictive analytics, and as the client base matures using data, predictive capabilities are required to ensure relevant messaging.

By implementing RealTimeML, the client wins by seeing predicted metrics within milliseconds of the input. Waiting for the subscriber to open the email and then identifying which links they clicked no longer serves the primary purpose. Predictive RealTimeML metrics also provide a dopamine effect for the campaign builder. That is the why.

Now the how. If you are offering a new product or service to a client base, I would enlist the best minds on the planet who have experience productionalizing RealTimeML. Think Airbnb. Here are a few names Mr. Dewey might call upon to assist in this endeavor: Andrew Ng, Chip Huyen, and Robert Crowe, among many others.

Each of these talented and experienced people would act as a consultant and sounding board as I began formulating a pilot project to be rolled out to a small number of clients at first, say 100 or 1,000.

We would need to start with at least 150K rows of data from MailChimp’s historical datasets and enlist data scientists to begin working on a prototype model. Assuming the rows were images, we could immediately double the size of that dataset by augmenting the data for the test model. Once expanded, the dataset would contain 300K images, and the extended data could be used in the style module of the content editor. Once properly labeled, whether by human labeling, weak supervision, or any other efficient labeling process, 300K images give us a decent-size dataset to begin testing our neural network model.

To deploy this model successfully, we must rely on proven deployment patterns. When considering a new model, keep the end of the lifecycle in mind: understand which deployment patterns you will use to confirm the model’s effectiveness and ensure an effective rollout.
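As a rough illustration of how augmentation doubles a labeled image set, here is a minimal sketch. The text does not name an augmentation technique, so the horizontal flip used here is an assumption, and the nested-list `Image` type is a toy stand-in for real image data. Whether a flip preserves the label is also an assumption: an email image containing text, for example, would be mirrored.

```python
from typing import List, Tuple

# Toy stand-in for an image: rows of grayscale pixel values.
Image = List[List[int]]
Sample = Tuple[Image, str]  # (image, label)


def horizontal_flip(image: Image) -> Image:
    """Return a mirrored copy of the image (each row reversed)."""
    return [list(reversed(row)) for row in image]


def augment_dataset(samples: List[Sample]) -> List[Sample]:
    """Return the original samples plus one flipped copy of each,
    so the result is exactly twice the size of the input."""
    augmented = list(samples)
    for image, label in samples:
        augmented.append((horizontal_flip(image), label))
    return augmented


# Hypothetical two-sample dataset with made-up labels.
dataset = [
    ([[0, 1, 2], [3, 4, 5]], "on_brand"),
    ([[9, 8, 7], [6, 5, 4]], "off_brand"),
]
doubled = augment_dataset(dataset)
print(len(dataset), "->", len(doubled))
```

Applied to 150K images, the same one-copy-per-sample approach yields the 300K figure mentioned above.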

Given that brand image is inherently crucial to every company, including MailChimp, one way to start the deployment phase is by using a deployment pattern called Shadow mode, where a human makes a judgment on the image and the algorithm shadows the assessment. Shadow mode at the very outset is a very effective way to determine the performance of your ML algorithm in a testing environment before any rollout.
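Shadow mode can be sketched in a few lines. The snippet below is a minimal illustration under stated assumptions: `ShadowLog`, `decide_in_shadow_mode`, and the toy model are hypothetical names, not part of any MailChimp or Loxz system. The point is that the human's judgment is always the one acted on, while the model's output is only recorded for later comparison.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ShadowLog:
    """Stores (human_decision, model_decision) pairs for later analysis."""
    records: List[Tuple[str, str]] = field(default_factory=list)

    def agreement_rate(self) -> float:
        """Fraction of items where the model matched the human."""
        if not self.records:
            return 0.0
        matches = sum(1 for human, model in self.records if human == model)
        return matches / len(self.records)


def decide_in_shadow_mode(
    item: object,
    human_decision: str,
    model: Callable[[object], str],
    log: ShadowLog,
) -> str:
    """Run the model in parallel, log its output, act only on the human."""
    model_decision = model(item)
    log.records.append((human_decision, model_decision))
    return human_decision  # the model's output is never used for the decision


# Hypothetical toy model: approves only images tagged "logo".
def toy_model(item: object) -> str:
    return "approve" if item == "logo" else "reject"


log = ShadowLog()
decide_in_shadow_mode("logo", "approve", toy_model, log)
decide_in_shadow_mode("banner", "approve", toy_model, log)
print(f"agreement rate: {log.agreement_rate():.2f}")
```

The logged disagreements are the natural input for error analysis before any traffic is handed to the model.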

There will be challenges with the 300K dataset in testing, where the human might have detected one pattern while the algorithm detected another. In this phase, the ML system’s output is not used for any decision-making; it just runs in parallel in a testing environment to mirror human judgment.

The next deployment pattern to consider is “canary mode,” where the algorithm is deployed across, say, 1% of the client base, or even less. Here is where human-centric feedback is critical to the success of any RealTimeML consideration. These users will provide the added feedback needed to consider everything from UI tweaks to serving and the overall usability of the new project. Any management team would understand that you do not roll out a learning algorithm to all 12M clients at once. It’s a gradual, iterative ramp-up that takes time. Think IP warm-up in the olden days, when you needed to get an email delivered to the inbox.

The third deployment pattern to consider is the blue/green deployment pattern. Here, the old version of the software is the Blue version, and images are initially routed to it. When a new iteration of the UI or an updated software version is complete, it becomes the Green version, and you begin routing some traffic to it, ramping up gradually.
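One common way to implement a gradual ramp like the canary rollout described above is to hash each client ID into a fixed bucket and compare the bucket against the current rollout fraction. The sketch below assumes string client IDs; the function names and the 10,000-bucket granularity are illustrative choices, not anything described in the text.

```python
import hashlib


def canary_bucket(client_id: str, buckets: int = 10_000) -> int:
    """Map a client ID to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(client_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def in_canary(client_id: str, rollout_fraction: float, buckets: int = 10_000) -> bool:
    """True if this client should be served by the new model at the
    given rollout fraction (0.01 means roughly 1% of clients)."""
    return canary_bucket(client_id, buckets) < rollout_fraction * buckets


# Hypothetical client IDs, for illustration only.
clients = [f"client-{i}" for i in range(1_000)]
for fraction in (0.01, 0.05, 0.25):
    served = sum(in_canary(c, fraction) for c in clients)
    print(f"rollout {fraction:.0%}: {served} of {len(clients)} clients")
```

Because the bucket depends only on the client ID, a client who is in the canary group at 1% stays in it as the fraction grows, and a rollback is just a matter of lowering the fraction.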

To speak to the degrees of automation: to run RealTimeML successfully, we would potentially start with humans making all decisions, move to shadow mode, and then partially automate. Since this is an iterative process, rollbacks will be commonplace, and regular, extensive error analysis will be needed to achieve higher accuracy rates for the algorithm. In general, as the system is retrained on more data, its accuracy should improve. There are additional and very complex considerations, including but not limited to UI/UX improvements, error analysis, feature engineering, deployment environments that reduce skew from testing to serving, data resiliency and data degradation, concept drift and data drift, data pre-processing, dimensionality reduction, and compute resources, to name a few.

For any ESP considering RealTimeML to create value for their existing client base, it is a massive undertaking with MLOps engineers in the middle. Creating accurate models is one thing, but deploying these models in a real-time productionalized environment is another. If there is one point on which the Chief Data Science Officer at MailChimp and I agree, it is that deployment is a big challenge.

I commend MailChimp for even beginning the dialogue on this, and while having them as a client at this point would be considered a miracle, at least we know that discussions around the concept are now taking place in earnest. If you are an ESP, don’t underestimate the value of RealTimeML as value creation for your clients and, just as significantly, the impact it will have on your bottom line. Our data scientists at Loxz are working on a 5th model for email: a sentiment analysis model that examines which types of sentiment lead to a higher conversion rate.
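The "partially automate" step is often implemented with a confidence threshold: the model decides when it is confident, and everything else goes to a person. The text does not specify a mechanism, so the sketch below is one possible reading; the function name, the 0.9 threshold, and the toy model are all assumptions.

```python
from typing import Callable, Tuple


def partially_automated_decision(
    item: object,
    model: Callable[[object], Tuple[str, float]],
    ask_human: Callable[[object], str],
    threshold: float = 0.9,
) -> Tuple[str, str]:
    """Return (decision, decided_by). The model decides only when its
    confidence meets the threshold; otherwise a human is asked."""
    label, confidence = model(item)
    if confidence >= threshold:
        return label, "model"
    return ask_human(item), "human"


# Hypothetical stand-ins for a real model and a real review queue.
def toy_model(item: object) -> Tuple[str, float]:
    return ("approve", 0.97) if item == "logo" else ("reject", 0.55)


def toy_reviewer(item: object) -> str:
    return "approve"


print(partially_automated_decision("logo", toy_model, toy_reviewer))
print(partially_automated_decision("banner", toy_model, toy_reviewer))
```

Raising the threshold sends more items to the human, which is one simple way to roll back toward less automation when error analysis turns up problems.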


Fred Tabsharani is Founder and CEO of Loxz Digital Group, a machine learning collective with an 18-member team. He has spent the last 15 years as a globally recognized digital growth leader. He holds an MBA from John F. Kennedy University and has added five AI/ML certifications, two from the UC Berkeley (SOI) Google, and two from IBM. Fred is a 10-year veteran of M3AAWG and an Armenian General Benevolent Union (AGBU) Olympic Basketball Champion.