In this tutorial, you’ll learn how to improve your iOS app thanks to efficient Core Data usage with batch insert, persistent history and derived properties.
I’ve been struggling with batch inserting/preloading data into Core Data with Swift 5, iOS 14 and Xcode 12. One thing I would like to do is change this code so that it batch inserts from a bundled JSON file, rather than a JSON file out on the web. Any pointers? How might you go about doing this? Any info you can provide is greatly appreciated.
First I create a function in the NSManagedObject subclass:
import CoreData
import SwiftUI

extension Person {
    @NSManaged var name: String?

    static func createSingle(name: String?, using viewContext: NSManagedObjectContext) {
        let person = Person(context: viewContext)
        person.name = name
        do {
            try viewContext.save()
        } catch {
            let nserror = error as NSError
            fatalError("Unresolved error \(nserror), \(nserror.userInfo)")
        }
    }
}
Then I put this together in ContentView so that pressing a button loads some JSON into Core Data and shows a count of how many names have been loaded.
import SwiftUI
import CoreData

struct ContentView: View {
    @Environment(\.managedObjectContext) private var viewContext
    @FetchRequest(
        sortDescriptors: [NSSortDescriptor(keyPath: \Person.name, ascending: true)],
        animation: .default) private var people: FetchedResults<Person>
    @State var name: String = ""

    var body: some View {
        let str = "{\"names\": [\"Dave\", \"Tim\", \"Tina\"]}"
        let data = Data(str.utf8)
        // let str1 = "{\"names\": [\"Bob\"]}"
        // let data = Data(str1.utf8)
        VStack {
            Text("\(people.count)")
            Button(action: {
                do {
                    // make sure this JSON is in the format we expect
                    if let json = try JSONSerialization.jsonObject(with: data, options: []) as? [String: Any] {
                        // try to read out a string array
                        if let names = json["names"] as? [String] {
                            for name in names {
                                self.name = name
                                Person.createSingle(name: self.name, using: self.viewContext)
                                print(name)
                            }
                            // name = names[0]
                            // Person.createSingle(name: self.name, using: self.viewContext)
                            // print(people)
                        }
                    }
                } catch let error as NSError {
                    print("Failed to load: \(error.localizedDescription)")
                }
            }) {
                Text("Add Names")
            }
        }
    }
}
Hi @dmalicke, Happy new year. If you like you can include a .json file in your app’s bundle and create a class much like the RemoteDataSource class from the article. The main difference will be you get the URL to the file using Bundle.main.url(forResource:withExtension:). Otherwise it’ll be very similar.
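As a rough illustration of that suggestion, here is a minimal sketch of reading the names from a bundled file instead of the web. The file name names.json, the NameList type and both function names are assumptions made for this example; only Bundle.main.url(forResource:withExtension:) comes from the reply above.

```swift
import Foundation

// Hypothetical shape of the bundled file: {"names": ["Dave", "Tim", "Tina"]}
struct NameList: Decodable {
    let names: [String]
}

enum BundledJSONError: Error {
    case fileNotFound(String)
}

// Decodes the names array from raw JSON data.
func decodeNames(from data: Data) throws -> [String] {
    try JSONDecoder().decode(NameList.self, from: data).names
}

// Looks up "<resource>.json" in the bundle and decodes it.
// The default resource name "names" is an assumption for this sketch.
func loadBundledNames(resource: String = "names", bundle: Bundle = .main) throws -> [String] {
    guard let url = bundle.url(forResource: resource, withExtension: "json") else {
        throw BundledJSONError.fileNotFound(resource)
    }
    return try decodeNames(from: Data(contentsOf: url))
}
```

The resulting array could then be fed to the same code that currently handles the downloaded JSON.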
@atetlaw Thank you very much for this tutorial. I’m using the information for my app, which downloads a large JSON file. But I have a question: how can I manage relationships (one-to-one, one-to-many and many-to-many)?
Hi @rufy, unfortunately one of the limitations of batch insert requests is that they can’t set relationships, but they’ll leave existing relationships untouched. This means you’ll need to set relationships manually. But once the relationships are set, the entity data can be updated safely without breaking the relations.
There can be more than one way to control relationships. If you have a way to define which child entities belong to which parent entities in the data you’re downloading, maybe with a parent ID, after you have finished the batch insert you might be able to use that parent ID to group child entities together in a fetch request and set their parent manually.
Of course it’ll depend on how many entities and how much time that’ll take.
@atetlaw Your idea is interesting, but unfortunately I have some problems inserting data into 2 tables. The problem with your tutorial is that it only enters data into one table. If your tutorial inserted data into a database consisting of, e.g., 3 tables (a parent and 2 children), I think it would be much more complete and useful, and you would see not only the limits of the request but also how it affects the timing.
Please, could you post sample code showing how to handle these cases here in the forum, for the benefit of Raywenderlich readers?
I would be grateful.
My app is a bit like the Fireball app. I’m going to add data, and the timing of data entry is quite important.
@atetlaw
I’m trying to apply your advice, and I have a question: is it possible to use more than one batch insert request in your method? I ask because each batch insert request has a specific entity.
private func newBatchInsertRequest(with fireballs: [FireballData]) -> NSBatchInsertRequest {
    // 1
    var index = 0
    let total = fireballs.count
    // 2
    let batchInsert = NSBatchInsertRequest(entity: Fireball.entity()) { (managedObject: NSManagedObject) -> Bool in
        // 3
        guard index < total else { return true }
        if let fireball = managedObject as? Fireball {
            // 4
            let data = fireballs[index]
            fireball.dateTimeStamp = data.dateTimeStamp
            fireball.radiatedEnergy = data.radiatedEnergy
            fireball.impactEnergy = data.impactEnergy
            fireball.latitude = data.latitude
            fireball.longitude = data.longitude
            fireball.altitude = data.altitude
            fireball.velocity = data.velocity
        }
        // 5
        index += 1
        return false
    }
    return batchInsert
}
Or do I have to create as many methods as there are tables that I download? In that case, will I have to repeat the context.execute(_:) call for each one as well?
Hi @rufy , here’s a quick answer, if I understand your questions clearly:
yes, each batch insert request can only insert 1 entity type
in the article I made a batchInsertFireballs method only because I only had 1 entity type as you pointed out. But you can make your own method to replace it and within the call container.performBackgroundTask {...} you can create and execute as many batch inserts as you like. But of course they still won’t update the relationships unless you specifically do that.
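To make that concrete, here is a minimal sketch of running several batch inserts inside one background task. It uses the dictionary-based initializer NSBatchInsertRequest(entityName:objects:) rather than the handler-based one from the article, and the function names and the payload tuple are assumptions made for this example. Entity names and dictionary keys must match your own model.

```swift
import CoreData

// Executes one NSBatchInsertRequest per entity, in the order given.
// Each request can only target a single entity, so the loop creates a new
// request for every payload. Relationships are NOT set by any of them.
func executeBatchInserts(_ payloads: [(entityName: String, rows: [[String: Any]])],
                         in context: NSManagedObjectContext) throws {
    for payload in payloads where !payload.rows.isEmpty {
        let request = NSBatchInsertRequest(entityName: payload.entityName,
                                           objects: payload.rows)
        request.resultType = .statusOnly
        _ = try context.execute(request)
    }
}

// Hypothetical wrapper: all inserts share one background context.
func importPayloads(_ payloads: [(entityName: String, rows: [[String: Any]])],
                    using container: NSPersistentContainer) {
    container.performBackgroundTask { context in
        do {
            try executeBatchInserts(payloads, in: context)
        } catch {
            print("Batch insert failed: \(error)")
        }
    }
}
```

Keeping the inserts in one task makes it easy to control their order, for example parents before children, before a later step sets the relationships.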
If you would be able to post some code you’re working on I might understand your situation a little better, and can give a better suggestion.
I’ll try to simplify my situation into two examples, so everybody who reads this post can get help.
Premise:
I download data from my server and decode the result JSON. After decoding I’ve an array that contains:
JSON:[
Radicals:[RadicalData]
Kanji:[KanjiData]*
]
Kanji
with relationship one-to-many (one radical has many kanji; one kanji has only one radical).
So, to insert all the data correctly I have to insert the Radical entities first and the Kanji entities afterwards, but when I insert a kanji I would like to set its radical.
func newBatchInsertRequest(with jsonDatabase: [JaappJSON], context: NSManagedObjectContext) -> NSBatchInsertRequest? {
    let database = jsonDatabase[0]
    if let radicali = database.radicali {
        var index = 0
        let total = radicali.count
        let batchInsert = NSBatchInsertRequest(entity: Radicale.entity()) { (managedObject: NSManagedObject) -> Bool in
            guard index < total else { return true }
            if let radicale = managedObject as? Radicale {
                // 4
                let data = radicali[index]
                radicale.concettoEN = data.concettoEN
                radicale.concettoITA = data.concettoITA
                radicale.nomeKana = data.nomeKana
                radicale.numTratti = data.numTratti!
                radicale.radicale = data.radicale
                radicale.varianti = data.varianti
            }
            index += 1
            return false
        }
        return batchInsert
    }
    if let kanjiList = database.kanji {
        var index = 0
        let total = kanjiList.count
        let batchInsert = NSBatchInsertRequest(entity: Kanji.entity()) { (managedObject: NSManagedObject) -> Bool in
            guard index < total else { return true }
            if let kanji = managedObject as? Kanji {
                let data = kanjiList[index]
                kanji.concettoEN = data.concettoEn
                kanji.concettoITA = data.concettoIta
                kanji.confrontaCon = data.confrontaCon
                kanji.contrario = data.contrario
                kanji.frequenza = data.frequenza!
                kanji.idDatabase = data.idDatabase!
                kanji.idHalpern = data.idHelpern!
                kanji.idNewNelson = data.idNewnelson!
                kanji.jlptLevel = data.jlptLevel!
                kanji.jouyouGrade = data.jouyouGrade!
                kanji.kanji = data.kanji
                kanji.numTratti = data.numTratti!
                kanji.radicale = data.radicale
                kanji.spiegConcettoEN = data.spiegConcettoEn
                kanji.spiegConcettoITA = data.spiegConcettoIta
                kanji.unicode = data.unicode
                do {
                    let requestRadicale: NSFetchRequest<Radicale> = Radicale.fetchRequest()
                    let listaRadicali = try context.fetch(requestRadicale)
                    kanji.radicaleRelazione = listaRadicali[0]
                } catch {
                    kanji.radicaleRelazione = nil
                }
                //Here situation 2
            }
            index += 1
            return false
        }
        return batchInsert
    }
    return nil
}
Situation 2:
I’ve three Entities:
Kanji
KunYomi
OnYomi
with relationship one-to-many (one kanji has many kunYomi; one kanji has many onYomi).
As you can see in the premise, the kunYomi and onYomi that belong to a kanji are inside the KanjiData object. So in this situation I would like to set the relationship when I insert the kanji.
Situation 2 Code List:
if let kunYomiSet = kanji.lettureKun {
    let kunYomiList = kunYomiSet.allObjects as! [KunYomiData]
    var indexKunYomi = 0
    let totalKunYomi = kunYomiList.count
    //TODO: insert kun yomi
    _ = NSBatchInsertRequest(entity: KunYomi.entity()) { (managedObject: NSManagedObject) -> Bool in
        guard indexKunYomi < totalKunYomi else { return true }
        if let kunYomi = managedObject as? KunYomi {
            // Use this list's own counter, not the outer kanji index.
            let data = kunYomiList[indexKunYomi]
            kunYomi.kunYomiKana = data.kunYomiKana
            kunYomi.kunYomiKanji = data.kunYomiKanji
            kunYomi.kunYomiRomaji = data.kunYomiRomaji
            kunYomi.kunYomiVideo = data.kunYomiVideo
            kanji.addToLettureKun(kunYomi)
        }
        indexKunYomi += 1
        return false
    }
}
if let onYomiSet = kanji.lettureOn {
    let onYomiList = onYomiSet.allObjects as! [OnYomiData]
    var indexOnYomi = 0
    let totalOnYomi = onYomiList.count
    _ = NSBatchInsertRequest(entity: OnYomi.entity(), managedObjectHandler: { (managedObject: NSManagedObject) -> Bool in
        guard indexOnYomi < totalOnYomi else { return true }
        if let onYomi = managedObject as? OnYomi {
            // Use this list's own counter, not the outer kanji index.
            let data = onYomiList[indexOnYomi]
            onYomi.onYomi = data.onYomi
            onYomi.onYomiRomaji = data.onYomiRomaji
            kanji.addToLettureOn(onYomi)
        }
        indexOnYomi += 1
        return false
    })
}
How can I do it?
I hope I have explained the two situations I have to handle clearly enough to get the insertion right.
It’s a little hard for me to give you good advice because I don’t know the quantity of data you’re inserting, and I can’t see how the relationships work - how do you know which radical belongs to which Kanji?
My first bit of advice would be to perform all the inserts first before you set the relationships, I suspect there’s a problem when trying to fetch data you’re inserting with a batch insert, while executing the insert. You’ll need to finish the execution of the batch inserts, and then perform a new task and get a new managed object context before you can fetch any of that data.
That’s sounding tricky to me, and now I’m wondering if batch insert requests are really suitable to your situation?
Hi @atetlaw, actually I need to insert more than 30,000 records.
I can’t see how the relationships work - how do you know which radical belongs to which Kanji?
The Radical entity has an attribute called ‘radical’ that contains the character.
The Kanji entity has an attribute called ‘radical’ that contains the character.
But you can use the typical method with id and id_radical attributes, if that is better for you.
That’s sounding tricky to me, and now I’m wondering if batch insert requests are really suitable to your situation?
This is a great question. But it’s very strange that there isn’t a way to set the relationship quickly.
Let me explain: my problem isn’t how to set the relationship. If you read the code that I posted in the post “NSBatchInsertRequest and Relationship”,
I do what you suggest: first the inserting, and afterwards the relationship.
My problem is the timing. As I already said, inserting without setting relationships takes only 5 seconds. Setting the relationships takes 35 seconds. To do this I loop over all the radicals, look for the matching kanji objects in the array of objects with filter, and set the relationship. I know that the filter function has O(n) time complexity. So is this the problem? Is there an alternative to the filter function? Maybe I have to use a fetch request?
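On the filter question specifically: calling filter once per radical scans the whole kanji array each time, so the total work grows with (number of radicals) × (number of kanji). One common alternative is to group the kanji by radical once, which is a single pass, and then look each group up by key. The sketch below shows the idea on plain structs; KanjiRow and groupKanjiByRadical are names invented for this example, and it is not a measurement of how much time it would save in the app above.

```swift
import Foundation

// Hypothetical plain value type standing in for a decoded kanji record.
struct KanjiRow {
    let kanji: String
    let radicale: String   // the radical character this kanji belongs to
}

// One pass over the kanji: builds radical character -> kanji with that radical.
func groupKanjiByRadical(_ rows: [KanjiRow]) -> [String: [KanjiRow]] {
    Dictionary(grouping: rows, by: { $0.radicale })
}

// Usage: the lookup replaces `rows.filter { $0.radicale == radical }`.
let rows = [
    KanjiRow(kanji: "休", radicale: "人"),
    KanjiRow(kanji: "林", radicale: "木"),
    KanjiRow(kanji: "体", radicale: "人"),
]
let kanjiByRadical = groupKanjiByRadical(rows)
let kanjiForPerson = kanjiByRadical["人"] ?? []   // the two kanji filed under 人
```

The same indexing idea applies to managed objects: build the dictionary once from the fetched kanji, then set each radical’s relationship from its group.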
I understand that thinking about a database that you don’t have can be complicated. So try to think of a database made up of different entities related to each other (it doesn’t have to be big; 3 or 4 entities are fine too), and imagine that the database is populated with 30,000-40,000 records. How would you enter all those records, and how would you set up the relationships in the shortest time possible? Keep in mind that the user has to wait for the download and database population to finish before using the app (and that’s where the difficulty lies: time).
How would you do it?
Do you need to load 30,000 kanji on every launch or only the first launch?
If only on first launch and the initial set of Kanji data is always the same, you could consider preloading it in your app bundle. That way when launching for the first time all the initial data is already there.
You mention that each radical has multiple kanji, but each kanji only has 1 radical? If so it may be more efficient to set the relationship from the radical side: loop through the radicals and use the kanji IDs to fetch all relevant Kanji, then set the radical’s kanjis property (or whatever that’s called in your model) to the set of Kanji you fetched. Using a predicate like id IN [ID array] is an efficient fetch in my experience. This might mean less work if it results in fewer fetches.
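A minimal sketch of that IN fetch, under a few assumptions: the entity name Kanji, the attribute idDatabase and the to-one relationship radicaleRelazione are taken from the code posted earlier in the thread, idDatabase is assumed to be a 64-bit integer, and the list of kanji IDs per radical is assumed to be available from the downloaded data. The function name is invented for this example.

```swift
import CoreData

// Points every Kanji whose `idDatabase` is in `kanjiIDs` at `radical`.
// Call this inside `context.perform { ... }` after the batch inserts have
// finished, and save the context afterwards.
func link(radical: NSManagedObject,
          toKanjiWithIDs kanjiIDs: [Int64],
          in context: NSManagedObjectContext) throws {
    let request = NSFetchRequest<NSManagedObject>(entityName: "Kanji")
    // One fetch per radical instead of one fetch per kanji.
    request.predicate = NSPredicate(format: "idDatabase IN %@", kanjiIDs as NSArray)
    for kanji in try context.fetch(request) {
        // Setting the to-one side is enough; Core Data maintains the inverse
        // to-many side if the model declares one.
        kanji.setValue(radical, forKey: "radicaleRelazione")
    }
}
```

The radical passed in must belong to the same context as the fetch, otherwise setting the relationship will fail.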
If the batch inserts are that fast, then you could also consider not setting the Core Data relationships at all. That means you’ll have 4 tables, and you can display the related items in your app using fetch requests that just use the IDs of the items. For example, if you have a Kanji display screen, then just fetch the appropriate radical using the id_radical value so you can display it. You lose part of Core Data’s object relationship management, but it’s much faster to import! That would depend on the user experience you’re going for.
To continue the example (the information is not only kanji), my idea is to download the database the first time the app starts; after that, the app will download the updates that I publish. So you can already see that it is 30,000 records now, but later it will be many more.
I understand your idea, but as I said earlier the starter set will not be static. I am going to insert new data into the database. If I publish the app with a bundled dataset, it means I’ll have to maintain it over time. I prefer to keep the database and the app functions separate.
I have already implemented this idea for the reasons you mentioned. The only problem I have is that with the current code I loop over the radical array and, for each radical, filter the kanji array to find its kanji. If I instead loop over the radical array and run a fetch with a predicate to find the kanji, the app crashes on the second iteration. I still don’t understand why. Keep in mind that at the time of setting the relationships I have not yet saved the context. Could this be the problem?
However, in the second case, in which a headword (lemma) has many writings and many meanings, things are different. We are talking about 7,000 headwords, 7,000 writings and 7,000 meanings, and it is this case that takes more than 20-30 seconds.
I also understand this solution very well, and on the one hand I am tempted. But it means more code to maintain in order to link two entities. Core Data makes it much easier to use entities and relationships as if they were classes, so I don’t know how worthwhile losing this feature would be. At the moment, the version of the app that is online uses a private context but does not use the batchInsertRequest, and the download and population currently take around 35 seconds. I wanted to use the batchInsertRequest to speed up the whole process, and if I don’t take the relationships into account, I can (about 30,000 records are saved in about 5 seconds. Very fast!). But is it possible that setting a relationship is so expensive from a computational point of view? Is it the for loop? Should I use mapping?
I have to think about it carefully. If you have other suggestions, they are welcome. In any case, I want to thank you for the various suggestions you have given me. I hope that together we can find a solution. Obviously, if I find something, I’ll let you know immediately.
Batch inserts are super fast because they operate at the database level, and no data is loaded into memory. Core Data fetching is fast, but it does involve moving data into memory, and when you have so much data to process that’s what slows it down. Unfortunately, modifying 30,000 records 1 at a time will just take that long. I thought 35 seconds was pretty good for how much work there is to do!
A couple of things can make fetching slower, so they are worth avoiding (but you’ll need to test whether they do in your case):
Avoid reading data from the fetched record if all you want to do is set the relationship. Or, if you know you need to read a property on the record, set returnsObjectsAsFaults to false on the fetch request.
Avoid using performAndWait on the NSManagedObjectContext; prefer perform instead.
The other thing I’d suggest is to batch import and then update the relationships in batches. Perform each batch as a separate operation. At least then you’ll have something appearing in the UI, and your user can start using the app. You’ll just have to be happy with progressive loading.
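One possible shape for the “update in batches” idea is sketched below. It walks a list of object IDs in fixed-size chunks and saves after each chunk, so other contexts that merge changes can show partial results while the rest is still being processed. The function name and its parameters are assumptions made for this example; it is meant to be called from a background context’s perform block, and the update closure is where the relationship would be set.

```swift
import CoreData

// Applies `update` to the objects behind `objectIDs` in fixed-size batches,
// saving the context after each batch.
func updateInBatches(objectIDs: [NSManagedObjectID],
                     batchSize: Int,
                     in context: NSManagedObjectContext,
                     update: (NSManagedObject) -> Void) throws {
    precondition(batchSize > 0, "batchSize must be positive")
    for start in stride(from: 0, to: objectIDs.count, by: batchSize) {
        let end = min(start + batchSize, objectIDs.count)
        for objectID in objectIDs[start..<end] {
            // object(with:) returns a fault; it is only filled if `update` reads it.
            update(context.object(with: objectID))
        }
        // Saving per batch makes this chunk visible to contexts that merge changes.
        try context.save()
    }
}
```

The batch size is a trade-off to tune by measurement: smaller batches surface data sooner, larger ones mean fewer saves.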
This is a really useful article. Well written and easy to follow. Thank you. The challenge was set to try adding code to delete the unnecessary transaction history after it’s already been processed. I was concerned about only deleting transactions with the correct author. Perhaps this is not an issue. I would be grateful to know how this is best done. I was thinking of using the newToken directly after storing it with deleteHistory(before token: [NSPersistentHistoryToken]). However, might I delete history not associated with the correct author?
Hi. BatchInsert does not seem to update existing records. For example, if I download this project and do the following:
Refresh to download the fireball data - works well
Modify the line fireball.altitude = data.altitude to fireball.altitude = 27 and press refresh again, then I would expect all the altitudes to change to 27, since the correct merge policy is set. However, this doesn’t work. The altitudes don’t change even on restarting.
Sorry, I didn’t explain myself well. I was suggesting the merge policy is not working as expected in the downloaded project. I did get my own code to work based on this article - thank you.
Hi @atetlaw, I have a question: where in the official Apple documentation is it written that NSBatchInsertRequest cannot be used to modify relationships between two or more entities?