NEW! Case Study: 81% Conversion Boost for Wizz
ContextSDK logo in purple
Products
ContextDecisionContextPush
Solutions
Solutions
Teams
DevelopersHeads of ProductMarketers
Industries
GamingEntertainmentHealthSocial MediaDating
Use cases
Dynamic Product ExperiencePush Notification Open Rate
Teams
DevelopersHeads of ProductMarketers
Industries
GamingEntertainmentHealthSocial MediaDating
Use Cases
Dynamic Product ExperiencePush Notification Open Rate
Resources
Resources
Case StudyNewsletterBlogDemo AppDocs
Company
ContactCareersPressPrivacy & Security
LoginContact us
Blog
/
Engineering

How to train your first machine learning model and run it inside your iOS app via CoreML

Felix Krause
May 6, 2024

Introduction

‍

Machine Learning (ML) in the context of mobile apps is a vast field, encompassing various implementations and requirements. At the highest level, it can be categorized into:

  1. Running ML models on server infrastructure and accessing them from your app through API requests
  2. Running ML models on-device within your app (we will focus on this)
  3. Fine-tuning pre-trained ML models on-device based on user behavior
  4. Training new ML models on-device

‍

In this series, we will focus on variant 2: starting by training a new ML model on your server infrastructure using real-life data, and then distributing and using that model within your app. Thanks to Apple’s CoreML technology, this process is highly efficient and streamlined.

‍

We have crafted this guide for all developers, regardless of their prior data science or backend experience.

‍

Step 1: Collecting the data to train your first ML model

‍

To train your first machine learning model, you need data to train on. For our example, we aim to optimize when to show certain in-app prompts or messages in iOS apps.

‍

Let’s assume your data looks like this:

  • Outcome describes the result of the user interaction, such as whether they purchased an optional premium upgrade
  • Battery Level indicates the user’s current battery level as a float
  • Phone Charging notifies if the phone is currently plugged in as a boolean

‍

In the above example, the “label” of the dataset is the outcome. In machine learning, a label for training data refers to the output or result for a specific instance in a dataset. The label is used to train a supervised model, helping it learn to classify new, unseen examples or predict outcomes.

‍

How you collect the data to train your model is your choice. In our scenario, we’d gather non-PII data similar to the example above to train models based on real-world user behavior. We’ve developed our own backend infrastructure to facilitate this, detailed in our Blog:

  • Building the Infrastructure to Ingest 40m Context Events per Day
  • Unifying Data Models Across a Heterogeneous Stack

‍

Step 2: Load and prepare your data

‍

There are different technologies available to train your ML model. In our case, we use Python along with pandas and sklearn.

‍

Load the recorded data into a pandas DataFrame:

Instead of using hard-coded data like above, you would access your database with the real-world data you’ve already collected.

‍

Step 3: Split the data between training and test data

‍

To train a machine learning model, you need to split your data into a training set and a test set. We won’t dive into why this is necessary, as there are numerous excellent resources explaining it, like this insightful CGP Video.

The code above splits your data by a ratio of 0.2 (⅕) and separates the X and the Y axis, which means separating the label (“Outcome”) from the rest of the data columns.

‍

Step 4: Start Model Training

‍

In this step, you must choose a classifier to use. For our example, we'll employ a basic RandomForest classifier:

The output of the training provides a classification report, which in simplified terms, tells you how accurate the trained model is.

In the screenshot above, we’re only using test data as part of this blog series. If you’re interested in interpreting and evaluating the classification report, check out this guide).

‍

Step 5: Export your model into a CoreML file

‍

Apple’s official CoreMLTools make it straightforward to export the classifier (in this case, our Random Forest) into a .mlmodel (CoreML) file, which we can run on Apple’s native ML chips. CoreMLTools support a range of classifiers, but not all, so it’s important to verify its compatibility first.

‍

Step 6: Bundle the CoreML file with your app

‍

For now, simply drag & drop the CoreML file into your Xcode project. In a future blog post, we will elaborate on deploying new ML models over-the-air.

‍

Once added to your project, you can inspect the inputs, labels, and other model details directly within Xcode.

‍

Step 7: Executing your Machine Learning model on-device

‍

Xcode will automatically generate a new Swift class based on your .mlmodel file, including details about the inputs and outputs.

In the code example provided, we input parameters such as battery level and charging status using an indexed array. This method does not use explicit string labels but provides enhanced performance if you have numerous inputs.

‍

Alternatively, during the model training and export process, you can opt for a String-based input for your CoreML file if desired.

‍

We will further discuss how to optimize your iOS app to leverage both methods while supporting over-the-air updates, dynamic inputs based on new models, and effectively handling errors, processing responses, managing complex A/B tests, safe rollouts, etc.

‍

Conclusion

‍

In this guide, we outlined the process from data collection to training a Machine Learning model and running it on-device to make decisions within your app. With Python’s rich library ecosystem, including Apple’s CoreMLTools, it’s straightforward to embark on your first ML model. Thanks to Xcode’s native support for CoreML files and on-device execution, developers can benefit from the robust features of Apple’s platform, such as model inspection within Xcode, strong typing, and reliable error handling.

‍

Within your organization, a Data Scientist will likely be responsible for training, fine-tuning, and providing the model. In this guide, we presented a simple example. However, ContextSDK takes into account over 180 different signals from diverse types, patterns, and sources to deliver optimal results, ensuring models remain compact and efficient.

‍

In the coming weeks, we will publish a second post on this subject, demonstrating how you can deploy new CoreML files to millions of iOS devices over-the-air, in a secure and cost-effective manner, while managing intricate A/B tests, dynamic input parameters, and more.

‍

Engineering
How to train your first machine learning model and run it inside your iOS app via CoreML
May 6, 2024
Engineering
Safely distribute new Machine Learning models to millions of iPhones over-the-air
May 22, 2024
Engineering
Building the Infrastructure to Ingest 40m Context Events per Day
January 24, 2024
Engineering
Detecting the user's context on Apple Vision Pro
February 21, 2024
Engineering
Automatically build & distribute custom iOS SDK Binaries for each customer
February 6, 2024
Blog Home
AllProductGrowthMarketEngineering

Subscribe to our newsletter, Contextualize this!

Welcome aboard!

Get ready to explore how real-world context can transform apps. Stay tuned for our upcoming issues!
Oops! Something went wrong while submitting the form.
LoginContact us
Leveraging real‒world user context to supercharge engagement and revenue since 2023.
Founded by Felix Krause and Dieter Rappold.
ContextDecisionContextPushSolutionsProductsDemo App
CompanyContactCareers
Privacy & SecurityPrivacy PolicyImprint
SOC II Type 2GDPR Compliant
© ContextSDK Inc. 169 Madison Avenue, STE 2895 New York, NY 10016 United States
support@contextsdk.com