Learn how to train a model and how to give it prediction capability using Core ML and Create ML in SwiftUI.
Believe it or not, research into artificial intelligence, or AI, goes way back to the 1950s, but it wasn’t until the late 1990s that it started to show its value by finding specific solutions to specific problems.
Machine learning, or ML, is one of the important fields of AI and primarily focuses on understanding and building methods that learn. It tries to build a model based on training data so it can make decisions or predictions without someone having programmed it to do so.
ML has two main objectives: classification and prediction.
- Classification classifies currently available data and makes decisions based on the developed models.
- Prediction makes forecasts of future outcomes.
On Apple platforms, Core ML and Create ML are the main frameworks for machine learning.
- Core ML lets you integrate a trained model into your app and run predictions with it on device on most Apple platforms.
- Create ML, available on iOS since iOS 15, provides you with a means to create and train a Core ML model inside your app on iOS, macOS, iPadOS, and Mac Catalyst.
In this tutorial, you’ll develop an app called Tshirtinder — an app designed to match you to the perfect t-shirt. As its name suggests, it shows you a t-shirt, then you express your interest — or lack thereof — with Tinder-style gestures of swiping right or left.
After each swipe, the app shows a selection of t-shirts it thinks would interest you. As the app learns your t-shirt preferences, the recommendations become more relevant.
Before you get to the fun part of judging t-shirts, you’ll satisfy these learning objectives:
- Use Create ML to integrate AI within an app.
- Create and train a model.
- Build out predictive capabilities.
Getting Started
Download the starter project by clicking on the Download Materials button at the top or bottom of the tutorial.
Open TShirtinder.xcodeproj, then build and run it on your device.
Take a moment to play with the app. All the code to support core features, such as the Tinder-style swipe animation, is already there for you to enjoy.
Note: You’ll need a real device to see all the functionalities working, because Create ML and Core ML aren’t available on the simulator. You could use the Mac (Designed for iPad) run destination if you’re on a Mac with an Apple M1 or better processor.
Regression vs. Classification
Regression predictive modeling problems are different from those of classification predictive modeling — in essence:
- Regression predicts a continuous quantity.
- Classification predicts a discrete class label.
Some overlaps exist between regression and classification:
- A regression algorithm may predict a discrete value if it’s in the form of an integer quantity.
- A classification algorithm may predict a continuous value if it’s in the form of a probability for a class label.
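To make the overlap concrete, here is a minimal sketch of turning a continuous regression score into a discrete label. The Verdict type, the verdict(forScore:cutoff:) function and the 0.5 cutoff are all assumptions made up for this illustration; they aren't part of the starter project.

```swift
// Hypothetical labels for illustration only.
enum Verdict {
  case liked, disliked, undecided
}

// Maps a continuous score to a discrete label.
// The default cutoff of 0.5 is an arbitrary assumption.
func verdict(forScore score: Double, cutoff: Double = 0.5) -> Verdict {
  if score >= cutoff { return .liked }
  if score <= -cutoff { return .disliked }
  return .undecided
}
```

A regression model still produces the score; the thresholding step is what turns it into a class-like answer.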
With these in mind, you could use either kind of model for Tshirtinder. Yet, looking at the algorithms available in Create ML, linear regression seems like a good fit.
What is Linear Regression?
Linear regression is a well-known algorithm in statistics and machine learning.
It’s a model that assumes a linear relationship between the input variables x and the single output variable y. It will calculate y from a linear combination of the input variables x.
In ML terms, people sometimes call input variables features. A feature is an individual measurable property or characteristic of a phenomenon.
Open shirts.json. As you see, all the t-shirts the app can show are in this file. For each t-shirt, there are features such as sleeve type, color, and neck type.
{
  "title": "Non-Plain Polo Short-Sleeve White",
  "image_name": "white-short-graphic-polo",
  "color": "white",
  "sleeve": "short",
  "design": "non-plain",
  "neck": "polo"
}
You can’t consider all the properties in each instance as features. For instance, the title or image_name isn’t suitable for showing the characteristics of a t-shirt, so you can’t use them to predict the output.
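As a rough sketch of separating features from the other properties, the snippet below decodes one entry shaped like the sample above. ShirtRecord, its features property and decodeShirt(from:) are hypothetical names for this illustration; the starter project’s own Shirt type may look different.

```swift
import Foundation

// Hypothetical model; the starter project's real Shirt type may differ.
struct ShirtRecord: Decodable {
  let title: String
  let imageName: String
  let color: String
  let sleeve: String
  let design: String
  let neck: String

  enum CodingKeys: String, CodingKey {
    case title, color, sleeve, design, neck
    case imageName = "image_name"
  }

  // Only these properties describe the shirt itself, so only they act as features.
  var features: [String: String] {
    ["color": color, "sleeve": sleeve, "design": design, "neck": neck]
  }
}

let sampleJSON = """
{
  "title": "Non-Plain Polo Short-Sleeve White",
  "image_name": "white-short-graphic-polo",
  "color": "white",
  "sleeve": "short",
  "design": "non-plain",
  "neck": "polo"
}
"""

// Decodes a single shirt entry from a JSON string.
func decodeShirt(from json: String) throws -> ShirtRecord {
  try JSONDecoder().decode(ShirtRecord.self, from: Data(json.utf8))
}
```

The title and image name still get decoded because the UI needs them, but they stay out of the features dictionary.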
Imagine you want to predict a value for a set of data with a single feature. You could visualize the data as such:
Linear regression tries to fit a line through the data.
Then you use it to predict an estimated output for an unseen input. Assuming you have a model with two features, a two-dimensional plane will fit through the data.
To generalize this idea, imagine that you have a model with n features: an n-dimensional hyperplane will be the regressor.
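With several features, the prediction is a weighted sum of the inputs plus an intercept. Here is a minimal sketch, assuming the features are already numeric; linearPrediction(features:weights:intercept:) is a made-up helper for this illustration, not a Create ML API.

```swift
// Computes y = intercept + weights[0] * features[0] + ... + weights[n-1] * features[n-1].
// Returns nil when the two arrays differ in length.
func linearPrediction(features: [Double], weights: [Double], intercept: Double) -> Double? {
  guard features.count == weights.count else { return nil }
  return zip(features, weights).reduce(intercept) { sum, pair in
    sum + pair.0 * pair.1
  }
}
```

Training a linear regressor amounts to finding the weights and the intercept that make these predictions fit the known outputs as closely as possible.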
Consider the equation below:
Y = a + b * X
Where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept: the value of Y when X equals 0.
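If you’re curious how a and b can be found for a single feature, the sketch below uses ordinary least squares. It’s meant as an illustration of the idea only; fitLine(xs:ys:) and predictY(for:a:b:) are hypothetical helpers, and nothing here claims to mirror how Create ML trains its regressor internally.

```swift
// Fits Y = a + b * X with ordinary least squares for a single feature.
// Returns nil when the fit is undefined: mismatched input lengths,
// fewer than two points, or no variation in X.
func fitLine(xs: [Double], ys: [Double]) -> (a: Double, b: Double)? {
  guard xs.count == ys.count, xs.count >= 2 else { return nil }
  let n = Double(xs.count)
  let meanX = xs.reduce(0, +) / n
  let meanY = ys.reduce(0, +) / n
  var covariance = 0.0
  var varianceX = 0.0
  for (x, y) in zip(xs, ys) {
    covariance += (x - meanX) * (y - meanY)
    varianceX += (x - meanX) * (x - meanX)
  }
  guard varianceX > 0 else { return nil }
  let b = covariance / varianceX  // slope
  let a = meanY - b * meanX       // intercept: the value of Y when X is 0
  return (a: a, b: b)
}

// Uses a fitted line to estimate the output for an unseen input.
func predictY(for x: Double, a: Double, b: Double) -> Double {
  a + b * x
}
```

For example, fitting the points (0, 1), (1, 3), (2, 5) and (3, 7) gives an intercept of 1 and a slope of 2, since those points lie exactly on Y = 1 + 2 * X.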
That’s enough theory for now.
How about you get your hands dirty and let technology help you get some new threads?
Preparing Data for Training
First, have a look at the methods you’ll work with and get to know how they work.
Open MainViewModel.swift and look at loadAllShirts().
This method asynchronously fetches all the shirts from shirts.json then stores them as a property of type FavoriteWrapper in MainViewModel. This wrapper adds a property to store the favorite status of each item, but the value is nil when there’s no information about the user’s preferences.
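To picture what such a wrapper might look like, here is a hedged sketch. FavoriteWrapperSketch is a made-up stand-in; the starter project’s FavoriteWrapper may be declared differently. The only details taken from this tutorial are the model and isFavorite property names.

```swift
// Hypothetical sketch; the starter project's FavoriteWrapper may be declared differently.
struct FavoriteWrapperSketch<Model> {
  let model: Model
  var isFavorite: Bool? = nil  // nil until the user swipes on the item
}

// Made-up sample items, wrapped with no preference recorded yet.
var wrapped = ["white polo", "black crew"].map { FavoriteWrapperSketch(model: $0) }

// The user swiped right on the first shirt.
wrapped[0].isFavorite = true

// Items the user has acted on, whether liked or disliked.
let acted = wrapped.filter { $0.isFavorite != nil }
```

Using an optional Boolean gives you three states (liked, disliked and not yet decided) without a separate flag.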
Now examine the other method, where most of the “magic” happens: didRemove(_:isLiked:). You call this method each time a user swipes an item.
The isLiked parameter tracks if the user liked a specific item or not.
This method first removes the item from shirts then updates the isFavorite field of the item in allShirts.
The shirts property holds all the items the user hasn’t yet acted on. Here’s when the ML part of the app comes in: You’ll compute recommended shirts anytime the user swipes left or right on a given t-shirt.
RecommendationStore handles the process of computing recommendations: it’ll train the model based on updated user inputs then suggest items the user might like.
Computing Recommendations
First, add an instance property to MainViewModel to hold and track the task of computing t-shirt recommendations for the user:
private var recommendationsTask: Task<Void, Never>?
If this were a real app, you’d probably want the output of the task and you’d also need some error handling. But this is a tutorial, so the generic types of Void and Never will do.
Next, add these lines at the end of didRemove(_:isLiked:):
// 1
recommendationsTask?.cancel()
// 2
recommendationsTask = Task {
  do {
    // 3
    let result = try await recommendationStore.computeRecommendations(basedOn: allShirts)
    // 4
    if !Task.isCancelled {
      recommendations = result
    }
  } catch {
    // 5
    print(error.localizedDescription)
  }
}
When the user swipes, didRemove(_:isLiked:) is called and the following happens:
- Cancel any ongoing computation task since the user may swipe quickly.
- Store the task inside the property you just created. Step 1 exemplifies why you need this.
- Ask recommendationStore to compute recommendations based on all the shirts. As you saw before, allShirts holds items of the type FavoriteWrapper, which carry the isFavorite status of shirts. Disregard the compiler error; you’ll address its complaint soon.
- Check for the canceled task, because by the time the result is ready, you might have canceled it. You check for that here so you don’t show stale data. If the task is still active, set the result to the recommendations published property. The view is watching this property and updates itself accordingly.
- Computing recommendations is an async function that can throw. If it fails, print an error log to the console.
Now open RecommendationStore.swift. Inside RecommendationStore, create this method:
func computeRecommendations(basedOn items: [FavoriteWrapper<Shirt>]) async throws -> [Shirt] {
  return []
}
This is the signature you used earlier in MainViewModel. For now, you return an empty array to silence the compiler.
Using TabularData for Training
Apple introduced a new framework in iOS 15 called TabularData. By utilizing this framework, you can import, organize and prepare a table of data to train a machine learning model.
Add the following to the top of RecommendationStore.swift:
import TabularData
Now create a method inside RecommendationStore:
private func dataFrame(for data: [FavoriteWrapper<Shirt>]) -> DataFrame {
  // Coming soon
}
The return type is DataFrame, a collection that arranges data in rows and columns. It is the base structure for your entry point into the TabularData framework.
You have options for handling the training data. In the next step, you’ll build it in code. But you could also load it from a CSV or JSON file using the initializers provided on DataFrame.
Replace the comment inside the method you created with the following:
// 1
var dataFrame = DataFrame()
// 2
dataFrame.append(column: Column(
  name: "color",
  contents: data.map(\.model.color.rawValue))
)
// 3
dataFrame.append(column: Column(
  name: "design",
  contents: data.map(\.model.design.rawValue))
)
dataFrame.append(column: Column(
  name: "neck",
  contents: data.map(\.model.neck.rawValue))
)
dataFrame.append(column: Column(
  name: "sleeve",
  contents: data.map(\.model.sleeve.rawValue))
)
// 4
dataFrame.append(column: Column<Int>(
  name: "favorite",
  contents: data.map {
    if let isFavorite = $0.isFavorite {
      return isFavorite ? 1 : -1
    } else {
      return 0
    }
  })
)
// 5
return dataFrame
Here is a step-by-step description of the above code:
- Initialize an empty DataFrame.
- Arrange the data into columns and rows. Each column has a name. Create a column for the color then fill it with all the data that’s been reduced to only color using map and a keypath.
- Append other columns to the data frame that are suitable for prediction: design, neck and sleeve. Bear in mind that the item count inside each column needs to be the same; otherwise, you’ll have a runtime crash.
- Append another column to record the favorite status of each item. If the value is not nil and it’s true then add a 1. But, if it’s false then add a -1. If the value is nil add a 0 to indicate the user hasn’t made a decision about it. This step uses numbers, not Booleans, so you can apply a regression algorithm later.
- Return the data frame.
Note: At the time of writing, Create ML methods don’t offer asynchronous implementations. It is possible, of course, to use the old and familiar Grand Central Dispatch, or GCD.
Now, add an instance property to the class to hold a reference to a DispatchQueue:
private let queue = DispatchQueue(
label: "com.recommendation-service.queue",
qos: .userInitiated)
Label it whatever you want. The qos parameter stands for Quality of Service. It determines the priority at which the system schedules the task for execution.
Now, it’s time to get back to computeRecommendations(basedOn:).
This method is async, but the training work runs on a GCD queue, so you need to bridge the GCD task into Swift’s async functions.
Replace the return statement inside the method’s implementation with:
return try await withCheckedThrowingContinuation { continuation in
  // Coming soon
}
withCheckedThrowingContinuation suspends the current task then calls the given closure with a continuation. A continuation is a mechanism to interface between synchronous and asynchronous code.
Inside this closure, call async on the queue you defined earlier:
queue.async {
  // Don't be hasty
}
When your result is ready inside the closure of the GCD queue, you call resume(returning:) on the continuation parameter. If any error occurs inside this queue then you call resume(throwing:).
The system will convert those calls into the async throws signature of Swift’s concurrency system.
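Here is a small, self-contained sketch of the same bridging pattern, kept separate from the app’s code. WorkError, workQueue and average(of:) are hypothetical names chosen for this illustration, and the averaging work merely stands in for any synchronous task you’d run on a queue.

```swift
import Dispatch

// Hypothetical error type for this illustration.
enum WorkError: Error {
  case emptyInput
}

// Hypothetical queue; the label is arbitrary.
let workQueue = DispatchQueue(label: "com.example.work-queue", qos: .userInitiated)

// Wraps synchronous work running on a GCD queue in an async throwing function.
func average(of values: [Double]) async throws -> Double {
  try await withCheckedThrowingContinuation { continuation in
    workQueue.async {
      guard !values.isEmpty else {
        // Failure path: resume once with an error, then stop.
        continuation.resume(throwing: WorkError.emptyInput)
        return
      }
      // Success path: resume once with the computed value.
      continuation.resume(returning: values.reduce(0, +) / Double(values.count))
    }
  }
}
```

Callers simply write try await average(of: [1, 2, 3]). Whatever shape your own work takes, make sure every path through the closure resumes the continuation exactly once: resuming twice is a programming error, and never resuming leaves the awaiting task suspended forever.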
From now on, all the code you’ll write will be inside the closure you passed to the queue’s async method.
Add a target check to throw an error on the simulator:
#if targetEnvironment(simulator)
continuation.resume(
throwing: NSError(
domain: "Simulator Not Supported",
code: -1
)
)
#else
// Write the next code snippets here
#endif
Add a constant to hold the training data inside the #else block:
let trainingData = items.filter {
  $0.isFavorite != nil
}
OK, so now you have a place to hold training data, but what exactly is this data? According to the definition you just created, the trainingData constant will include all the items where the user has taken an action. In ML, you’ll commonly come across three kinds of data:
- Training Data: The sample of data you use to fit the model.
- Validation Data: The sample of data held back from training your model. Its purpose is to give an estimate of model skill while tuning the model’s parameters.
- Test Data: The sample of data you use to assess the created model.
Below your previous code, create a data frame using the trainingData constant and dataFrame(for:), which you created earlier:
let trainingDataFrame = self.dataFrame(for: trainingData)
Finally, add the following:
let testData = items
let testDataFrame = self.dataFrame(for: testData)
This creates the constants for your test data. Here you tell the recommendation system to infer the results based on all the items, whether the user acted on them or not.
The training and test datasets are ready.
Predicting T-shirt Tastes
Now that your data is in order, you get to incorporate an algorithm to actually do the prediction. Say hello to MLLinearRegressor! :]
Implementing Regression
First, add the import directive to the top of the file as follows:
#if canImport(CreateML)
import CreateML
#endif
You conditionally import CreateML because this framework isn’t available on the simulator.
Next, immediately after your code to create the test data constants, create a regressor with the training data:
do {
  // 1
  let regressor = try MLLinearRegressor(
    trainingData: trainingDataFrame,
    targetColumn: "favorite")
} catch {
  // 2
  continuation.resume(throwing: error)
}
Here’s what the code does:
- Create a regressor to estimate the favorite target column as a linear function of the properties in the trainingDataFrame.
- If any errors happen, you resume the continuation using the error. Don’t forget that you’re still inside the withCheckedThrowingContinuation(function:_:) closure.
You may ask what happened to the validation data.
If you jump to the definition of the MLLinearRegressor initializer, you’ll see this:
public init(
trainingData: DataFrame,
targetColumn: String,
featureColumns: [String]? = nil,
parameters: MLLinearRegressor.ModelParameters =
ModelParameters(
validation: .split(strategy: .automatic)
)
) throws
Two parameters have default values: featureColumns and parameters.
You left featureColumns at its default of nil, so the regressor will use all columns apart from the specified targetColumn to create the model.
The default value for parameters implies the regressor splits the training data and uses some of it for validation purposes. You can tune this parameter based on your needs.
Beneath where you defined the regressor, still inside the do block, add this:
let predictionsColumn = (try regressor.predictions(from: testDataFrame))
  .compactMap { value in
    value as? Double
  }
You first call predictions(from:) on testDataFrame, and the result is a type-erased AnyColumn. Since you specified the targetColumn (remember, that’s the favorite column) to be a numeric value, you cast each element to Double using compactMap(_:).
Good work! You’ve successfully built the model and implemented the regression algorithm.
Showing Recommended T-shirts
In this section, you’ll sort the predicted results and show the first 10 items as the recommended t-shirts.
Immediately below your previous code, add this:
let sorted = zip(testData, predictionsColumn) // 1
  .sorted { lhs, rhs -> Bool in // 2
    lhs.1 > rhs.1
  }
  .filter { // 3
    $0.1 > 0
  }
  .prefix(10) // 4
Here’s a step-by-step breakdown of this code:
- Use zip(_:_:) to create a sequence of pairs built out of two underlying sequences: testData and predictionsColumn.
- Sort the newly created sequence based on the second element of the pair, aka the prediction value.
- Next, only keep the items for which the prediction value is positive. If you remember, the value of 1 for the favorite column means the user liked that specific t-shirt, 0 means undecided and -1 means disliked.
- You only keep the first 10 items but you could set it to show more or fewer. 10 is an arbitrary number.
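To see these four steps on concrete values, here is a standalone sketch that doesn’t need Create ML. The shirt names and scores are made-up sample data, and topRecommendations(_:scores:limit:) is a hypothetical helper that mirrors the chain above.

```swift
// Hypothetical sample data: shirt names paired with made-up prediction scores.
let shirts = ["red polo", "blue crew", "white v-neck", "black crew"]
let scores = [0.8, -0.4, 0.3, 0.0]

// Pairs items with scores, ranks them, and returns the best positive matches.
func topRecommendations<Item>(_ items: [Item], scores: [Double], limit: Int) -> [Item] {
  zip(items, scores)
    .sorted { $0.1 > $1.1 }  // highest score first
    .filter { $0.1 > 0 }     // keep only positive scores
    .prefix(limit)           // cap the number of results
    .map { $0.0 }            // drop the score, keep the item
}
```

With this sample data and a limit of 10, the result is "red polo" followed by "white v-neck". The other two shirts drop out because their scores aren’t positive.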
Once you’ve got the first 10 recommended items, the next step is to add code to unzip and return instances of Shirt. Below the previous code, add the following:
let result = sorted.map(\.0.model)
continuation.resume(returning: result)
This code gets the first item of the pair using \.0, gets the model from FavoriteWrapper then resumes the continuation with the result.
You’ve come a long way!
The completed implementation for computeRecommendations(basedOn:) should look like this:
func computeRecommendations(basedOn items: [FavoriteWrapper<Shirt>]) async throws -> [Shirt] {
  return try await withCheckedThrowingContinuation { continuation in
    queue.async {
      #if targetEnvironment(simulator)
      continuation.resume(
        throwing: NSError(
          domain: "Simulator Not Supported",
          code: -1
        )
      )
      #else
      let trainingData = items.filter {
        $0.isFavorite != nil
      }
      let trainingDataFrame = self.dataFrame(for: trainingData)
      let testData = items
      let testDataFrame = self.dataFrame(for: testData)
      do {
        let regressor = try MLLinearRegressor(
          trainingData: trainingDataFrame,
          targetColumn: "favorite"
        )
        let predictionsColumn = (try regressor.predictions(from: testDataFrame))
          .compactMap { value in
            value as? Double
          }
        let sorted = zip(testData, predictionsColumn)
          .sorted { lhs, rhs -> Bool in
            lhs.1 > rhs.1
          }
          .filter {
            $0.1 > 0
          }
          .prefix(10)
        let result = sorted.map(\.0.model)
        continuation.resume(returning: result)
      } catch {
        continuation.resume(throwing: error)
      }
      #endif
    }
  }
}
Build and run. Try swiping something. You’ll see the recommendations row update each time you swipe left or right.
Where to Go From Here?
Click the Download Materials button at the top or bottom of this tutorial to download the final project for this tutorial.
In this tutorial, you learned:
- A little of Create ML’s capabilities.
- How to build and train a machine learning model.
- How to use your model to make predictions based on user actions.
Machine learning is changing the way the world works, and it goes far beyond helping you pick the perfect t-shirt!
Most apps and services use ML to curate your feeds, make suggestions, and learn how to improve your experience. And it is capable of so much more — the concepts and applications in the ML world are broad.
ML has made today’s apps far smarter than the apps that delighted us in the early days of smartphones. It wasn’t always this easy to implement though — investments in data science, ultra-fast cloud computing, cheaper and faster storage, and an abundance of fresh data thanks to all these smartphones have allowed this world-changing technology to be democratized over the last decade.
Create ML is a shining example of how far this tech has come.
People spend years in universities to become professionals. But you can learn a lot about it without leaving your home. And you can put it to use in your app without having to first become an expert.
To explore the framework you just used, see Create ML Tutorial: Getting Started.
For a more immersive ML experience for mobile app developers, see our book Machine Learning by Tutorials.
You could also dive into ML by taking Supervised Machine Learning: Regression and Classification on Coursera. The instructor, Andrew Ng, is a Stanford professor and renowned in the ML community.
For ML on Apple platforms, you can always consult the documentation for Core ML and Create ML.
Moreover, Apple provides a huge number of videos on the subject. Watch sessions such as Build dynamic iOS apps with the Create ML framework from WWDC 21 and What’s new in Create ML from WWDC 22.
Do you have any questions or comments? If so, please join the discussion in the forums below.