coroutineScope: waits for all its children before it can finish itself (when children do not throw)
Mix of await and launch in coroutineScope, coroutineScope waits for all children to finish.
Code
package com.glassthought.sandbox
import gt.sandbox.util.output.Out
import kotlinx.coroutines.async
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlin.system.exitProcess
import kotlin.time.Duration.Companion.seconds
private val out = Out.standard()
private suspend fun mainImpl(out: Out) {
coroutineScope {
foo("msg-1")
}
}
private suspend fun foo(msg: String) {
out.actionWithMsg("functionThatStartsCoRoutineScope()", { functionThatStartsCoRoutineScope(msg) })
}
private suspend fun functionThatStartsCoRoutineScope(msg: String) {
coroutineScope {
out.info("This coroutineScope suspends the calling function.")
async {
out.delayNamed(3.seconds, "delayed([${msg}])")
}
async {
out.delayNamed(2.seconds, "delayed([${msg}])")
}
launch {
out.delayNamed(1.seconds, "delayed([${msg}])")
}
out.info("Just launched co-routines")
}
}
fun main(): Unit = runBlocking {
out.info("START - ON MAIN")
try {
mainImpl(out)
} catch (e: Exception) {
out.error("back at MAIN got an exception! of type=[${e::class.simpleName}] with msg=[${e.message}] cause=[${e.cause}]. Exiting with error code 1")
exitProcess(1)
}
out.info("DONE - WITHOUT errors on MAIN")
}
Command to reproduce:
gt.sandbox.checkout.commit 295a8f8cfa333633ea0a \
&& cd "${GT_SANDBOX_REPO}" \
&& cmd.run.announce "./gradlew run --quiet"
Recorded output of command:
Picked up JAVA_TOOL_OPTIONS: -Dkotlinx.coroutines.debug
Picked up JAVA_TOOL_OPTIONS: -Dkotlinx.coroutines.debug
[INFO][elapsed: 38ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] START - ON MAIN
[INFO][elapsed: 54ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] [>] Starting action=[functionThatStartsCoRoutineScope()]
[INFO][elapsed: 54ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] This coroutineScope suspends the calling function.
[INFO][elapsed: 56ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] Just launched co-routines
[INFO][elapsed: 59ms][🥇][⓶][coroutname:@coroutine#2][tname:main/tid:1] Delaying for 3s what_for=[delayed([msg-1])]
[INFO][elapsed: 61ms][🥇][⓷][coroutname:@coroutine#3][tname:main/tid:1] Delaying for 2s what_for=[delayed([msg-1])]
[INFO][elapsed: 61ms][🥇][⓸][coroutname:@coroutine#4][tname:main/tid:1] Delaying for 1s what_for=[delayed([msg-1])]
[INFO][elapsed: 1063ms][🥇][⓸][coroutname:@coroutine#4][tname:main/tid:1] Done delaying for 1s what_for=[delayed([msg-1])]
[INFO][elapsed: 2061ms][🥇][⓷][coroutname:@coroutine#3][tname:main/tid:1] Done delaying for 2s what_for=[delayed([msg-1])]
[INFO][elapsed: 3060ms][🥇][⓶][coroutname:@coroutine#2][tname:main/tid:1] Done delaying for 3s what_for=[delayed([msg-1])]
[INFO][elapsed: 3061ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] [>] Finished action=[functionThatStartsCoRoutineScope()].
[INFO][elapsed: 3061ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] DONE - WITHOUT errors on MAIN
Remember that: coroutineScope: waits for all its children before it can finish itself (when children do not throw)
This creates 2 potential gotchas IF we mix essential and non essential calls:
- Essential data could be slowed down due to waiting for non essential function to finish.
- Essential data can be blocked from returning if non essential data fails.
Example: slow down
Here is an example where after get the required data we spawn a launch
to notify metrics. We do it in the same scope and hence, we end up delaying data being shown to the user until we notify metrics.
GT-Sandbox-Snapshot
Code
package com.glassthought.sandbox
import com.glassthought.sandbox.util.out.impl.out
import gt.sandbox.util.output.Emojis
import gt.sandbox.util.output.Out
import kotlinx.coroutines.CoroutineName
import kotlinx.coroutines.async
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlin.system.exitProcess
import kotlin.time.Duration.Companion.milliseconds
private suspend fun mainImpl(out: Out) {
val useCase = GetUserDataUseCase(UserDataRepository())
val userData = useCase.getUserData()
out.info(
"User data that can be used by user: $userData"
)
}
class GetUserDataUseCase(
private val repo: UserDataRepository,
) {
suspend fun getUserData() = coroutineScope {
val name = async(CoroutineName("getName()")) { repo.getName() }
val friends = async(CoroutineName("getFriends()")) { repo.getFriends() }
val profile = async(CoroutineName("getProfile()")) { repo.getProfile() }
// Await to get all the user data - ESSENTIAL call!
val user = User(
name = name.await(),
friends = friends.await(),
profile = profile.await()
)
out.info("We have all the required user data, let's just notify metrics...")
// Notify metric system that we got the user data.
// NON-ESSENTIAL call.
// - 1) we end up blocking prior to returning essential user data.
// - 2) if notifyMetricsOfUserDataLoaded() fails the USER data will NOT be returned.
launch(CoroutineName("notifyMetricsOfUserDataLoaded()")) { repo.notifyMetricsOfUserDataLoaded() }
user
}
}
fun main(): Unit = runBlocking {
out.info("START - ON MAIN")
try {
mainImpl(out)
} catch (e: Exception) {
out.error("EXCEPTION AT MAIN! [${e::class.simpleName}] with msg=[${e.message}]!")
exitProcess(1)
}
out.info("DONE - WITHOUT errors on MAIN")
}
class UserDataRepository {
suspend fun getName(): String {
return out.actionWithMsg(
"getName()", { "Tom" }, 500.milliseconds
)
}
suspend fun getFriends() {
return out.actionWithMsg(
"getFriends()", { listOf("Alice", "Bob") }, 1000.milliseconds
)
}
suspend fun getProfile() {
return out.actionWithMsg(
"getProfile()", { "Profile of Tom" }, 1500.milliseconds
)
}
suspend fun notifyMetricsOfUserDataLoaded() {
return out.actionWithMsg(
"${Emojis.TURTLE}notifyMetricsOfUserDataLoaded()${Emojis.TURTLE}", { "metrics-updated" }, 5000.milliseconds
)
}
}
class User(name: String, friends: Any, profile: Any)
Command to reproduce:
gt.sandbox.checkout.commit 9771a4493454f0700e2d \
&& cd "${GT_SANDBOX_REPO}" \
&& cmd.run.announce "./gradlew run --quiet"
Recorded output of command:
Picked up JAVA_TOOL_OPTIONS: -Dkotlinx.coroutines.debug
Picked up JAVA_TOOL_OPTIONS: -Dkotlinx.coroutines.debug
[INFO][elapsed: 14ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] START - ON MAIN
[INFO][elapsed: 35ms][🥇][⓶][coroutname:@getName()#2][tname:main/tid:1] [>] Starting action=[getName()] with delay before action of [500 ms]
[INFO][elapsed: 37ms][🥇][⓷][coroutname:@getFriends()#3][tname:main/tid:1] [>] Starting action=[getFriends()] with delay before action of [1000 ms]
[INFO][elapsed: 37ms][🥇][⓸][coroutname:@getProfile()#4][tname:main/tid:1] [>] Starting action=[getProfile()] with delay before action of [1500 ms]
[INFO][elapsed: 537ms][🥇][⓶][coroutname:@getName()#2][tname:main/tid:1] [<] Finished action=[getName()].
[INFO][elapsed: 1037ms][🥇][⓷][coroutname:@getFriends()#3][tname:main/tid:1] [<] Finished action=[getFriends()].
[INFO][elapsed: 1537ms][🥇][⓸][coroutname:@getProfile()#4][tname:main/tid:1] [<] Finished action=[getProfile()].
[INFO][elapsed: 1538ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] We have all the required user data, let's just notify metrics...
[INFO][elapsed: 1539ms][🥇][⓹][coroutname:@notifyMetricsOfUserDataLoaded()#5][tname:main/tid:1] [>] Starting action=[🐢notifyMetricsOfUserDataLoaded()🐢] with delay before action of [5000 ms]
[INFO][elapsed: 6539ms][🥇][⓹][coroutname:@notifyMetricsOfUserDataLoaded()#5][tname:main/tid:1] [<] Finished action=[🐢notifyMetricsOfUserDataLoaded()🐢].
[INFO][elapsed: 6540ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] User data that can be used by user: com.glassthought.sandbox.User@34b7bfc0
[INFO][elapsed: 6540ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] DONE - WITHOUT errors on MAIN
Example: non essential call preventing Essential
Now let's take the same example but instead of just delaying the metric notification first delays and then throws an exception.
Now not only did we make the user wait, they waited for a crash and never saw their data:
GT-Sandbox-Snapshot
Code
package com.glassthought.sandbox
import com.glassthought.sandbox.util.out.impl.out
import gt.sandbox.util.output.Emojis
import gt.sandbox.util.output.Out
import kotlinx.coroutines.CoroutineName
import kotlinx.coroutines.async
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlin.system.exitProcess
import kotlin.time.Duration.Companion.milliseconds
private suspend fun mainImpl(out: Out) {
val useCase = GetUserDataUseCase(UserDataRepository())
val userData = useCase.getUserData()
out.info(
"User data that can be used by user: $userData"
)
}
class GetUserDataUseCase(
private val repo: UserDataRepository,
) {
suspend fun getUserData() = coroutineScope {
val name = async(CoroutineName("getName()")) { repo.getName() }
val friends = async(CoroutineName("getFriends()")) { repo.getFriends() }
val profile = async(CoroutineName("getProfile()")) { repo.getProfile() }
// Await to get all the user data - ESSENTIAL call!
val user = User(
name = name.await(),
friends = friends.await(),
profile = profile.await()
)
out.info("We have all the required user data, let's just notify metrics...")
// Notify metric system that we got the user data.
// NON-ESSENTIAL call.
// - 1) we end up blocking prior to returning essential user data.
// - 2) if notifyMetricsOfUserDataLoaded() fails the USER data will NOT be returned.
launch(CoroutineName("notifyMetricsOfUserDataLoaded()")) { repo.notifyMetricsOfUserDataLoaded() }
user
}
}
fun main(): Unit = runBlocking {
out.info("START - ON MAIN")
try {
mainImpl(out)
} catch (e: Exception) {
out.error("EXCEPTION AT MAIN! [${e::class.simpleName}] with msg=[${e.message}]!")
exitProcess(1)
}
out.info("DONE - WITHOUT errors on MAIN")
}
class UserDataRepository {
suspend fun getName(): String {
return out.actionWithMsg(
"getName()", { "Tom" }, 500.milliseconds
)
}
suspend fun getFriends() {
return out.actionWithMsg(
"getFriends()", { listOf("Alice", "Bob") }, 1000.milliseconds
)
}
suspend fun getProfile() {
return out.actionWithMsg(
"getProfile()", { "Profile of Tom" }, 1500.milliseconds
)
}
suspend fun notifyMetricsOfUserDataLoaded() {
return out.actionWithMsg(
"${Emojis.TURTLE}notifyMetricsOfUserDataLoaded()${Emojis.TURTLE}", {
throw MyRuntimeException.create(
"metric-system-is-down",
out
)
}, 5000.milliseconds
)
}
}
class User(name: String, friends: Any, profile: Any)
Command to reproduce:
gt.sandbox.checkout.commit e74dc0d38ef9f00c9457 \
&& cd "${GT_SANDBOX_REPO}" \
&& cmd.run.announce "./gradlew run --quiet"
Recorded output of command:
Picked up JAVA_TOOL_OPTIONS: -Dkotlinx.coroutines.debug
Picked up JAVA_TOOL_OPTIONS: -Dkotlinx.coroutines.debug
[INFO][elapsed: 20ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] START - ON MAIN
[INFO][elapsed: 47ms][🥇][⓶][coroutname:@getName()#2][tname:main/tid:1] [>] Starting action=[getName()] with delay before action of [500 ms]
[INFO][elapsed: 48ms][🥇][⓷][coroutname:@getFriends()#3][tname:main/tid:1] [>] Starting action=[getFriends()] with delay before action of [1000 ms]
[INFO][elapsed: 49ms][🥇][⓸][coroutname:@getProfile()#4][tname:main/tid:1] [>] Starting action=[getProfile()] with delay before action of [1500 ms]
[INFO][elapsed: 549ms][🥇][⓶][coroutname:@getName()#2][tname:main/tid:1] [<] Finished action=[getName()].
[INFO][elapsed: 1049ms][🥇][⓷][coroutname:@getFriends()#3][tname:main/tid:1] [<] Finished action=[getFriends()].
[INFO][elapsed: 1549ms][🥇][⓸][coroutname:@getProfile()#4][tname:main/tid:1] [<] Finished action=[getProfile()].
[INFO][elapsed: 1550ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] We have all the required user data, let's just notify metrics...
[INFO][elapsed: 1551ms][🥇][⓹][coroutname:@notifyMetricsOfUserDataLoaded()#5][tname:main/tid:1] [>] Starting action=[🐢notifyMetricsOfUserDataLoaded()🐢] with delay before action of [5000 ms]
[WARN][elapsed: 6582ms][🥇][⓹][coroutname:@notifyMetricsOfUserDataLoaded()#5][tname:main/tid:1] 💥 throwing exception=[MyRuntimeException] with msg=[metric-system-is-down]
[WARN][elapsed: 6582ms][🥇][⓹][coroutname:@notifyMetricsOfUserDataLoaded()#5][tname:main/tid:1] [<][💥] Finished action=[🐢notifyMetricsOfUserDataLoaded()🐢], threw exception of type=[MyRuntimeException].
[ERROR][elapsed: 6587ms][🥇][①][coroutname:@coroutine#1][tname:main/tid:1] EXCEPTION AT MAIN! [MyRuntimeException] with msg=[metric-system-is-down]!
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':app:run'.
> Process 'command '/home/nickolaykondratyev/.jdks/corretto-21.0.7/bin/java'' finished with non-zero exit value 1
* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
> Get more help at https://help.gradle.org.
BUILD FAILED in 7s
How to fix?
When you have an additional (non-essential) operation that should not influence the main process, it is better to start it on a separate scope.
Creating your own scope is easy.[author goes on to to create a scope that does not respect parent cancellation in his example] - Kotlin Coroutines Deep Dive
Let's fix our example with addition of hand created metric CoroutineScope that is meant to be used in the background. Our metric scope has these characteristics:
- Uses Supervisor-Job so that it doesn't get cancelled on failures.
- Uses CoroutineExceptionHandler so that we process background errors.
Initial attempt that introduces parent cancellation issue:
If we attempt to create create scope with SupervisorJob()
we end up with a scope that does not respect parent cancellation.
Example Illustrated below, where once we cancel parent job, our background worker continues to work.
GT-Sandbox-Snapshot
Code
package com.glassthought.sandbox
import com.glassthought.sandbox.util.out.impl.out
import gt.sandbox.util.output.Emojis
import gt.sandbox.util.output.Out
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.CoroutineName
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.async
import kotlinx.coroutines.cancel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlin.time.Duration.Companion.milliseconds
import kotlin.time.Duration.Companion.seconds
private suspend fun mainImpl(out: Out) {
val metricsScope = createMetricsScope(out)
val useCase = GetUserDataUseCase(UserDataRepository(), metricsScope)
val userData = useCase.getUserData()
out.info("User data that can be used by user: $userData")
delay(3000.milliseconds)
}
private suspend fun createMetricsScope(out: Out): CoroutineScope {
val coroutineExceptionHandler = CoroutineExceptionHandler { coroutineContext, throwable ->
runBlocking {
out.warn("background failure handled, failure_message=[" + throwable.message + "] in coroutine_context=[" + coroutineContext + "]")
}
}
// [⚠️ +Job()/+SupervisorJob() must-inherit-from-parent to respect parent cancellation ⚠️](http://www.glassthought.com/notes/678f18et3ek7l31u2g5kxte) just creating a SupervisorJob() is prblematic, since it does not inherit from parent scope. and will not cancel children on parent cancellation.
return CoroutineScope(
SupervisorJob()
+ CoroutineName("MetricsScope")
+ coroutineExceptionHandler
)
}
class GetUserDataUseCase(
private val repo: UserDataRepository,
private val metricsScope: CoroutineScope,
) {
suspend fun getUserData() = coroutineScope {
val name = async(CoroutineName("getName()")) { repo.getName() }
val friends = async(CoroutineName("getFriends()")) { repo.getFriends() }
val profile = async(CoroutineName("getProfile()")) { repo.getProfile() }
// Await to get all the user data - ESSENTIAL call!
val user = User(
name = name.await(),
friends = friends.await(),
profile = profile.await()
)
out.info("We have all the required user data, let's just notify metrics...")
// Notify metric system that we got the user data. On another scope.
metricsScope.launch(
CoroutineName("notifyMetricsOfUserDataLoaded()")
) { repo.notifyMetricsOfUserDataLoaded() }
user
}
}
fun main(): Unit {
// NOTE: nested runBlocking is an ANTI-PATTERN! but it makes it easier to
// illustrate the issue with SupervisorJob() not inheriting from parent scope.
// runBlocking-l1
runBlocking(CoroutineName("runBlocking-l1")) {
try {
// runBlocking-l2
runBlocking(CoroutineName("runBlocking-l2")) {
val runBlockingL2 = this
launch {
out.actionWithMsg(
"cancel runBlocking-l2 scope", {
runBlockingL2.cancel(
CancellationException("cancel entire runBlocking-l2 scope")
)
}, delayDuration = 2.seconds
)
}
try {
mainImpl(out)
out.info("DONE - runBlocking-l2 - mainImpl() finished without exceptions!")
} catch (e: Exception) {
out.error("EXCEPTION AT runBlocking-l2 [${e::class.simpleName}] with msg=[${e.message}]!")
}
}
} catch (e: Exception) {
out.error("runBlocking-l2 threw error that we caught [${e::class.simpleName}] with msg=[${e.message}] (inRunBlocking-l1)!")
}
out.delayNamed(5.seconds, "Delay for background scope to finish")
}
}
class UserDataRepository {
suspend fun getName(): String {
return out.actionWithMsg(
"getName()", { "Tom" }, 500.milliseconds
)
}
suspend fun getFriends() {
return out.actionWithMsg(
"getFriends()", { listOf("Alice", "Bob") }, 1000.milliseconds
)
}
suspend fun getProfile() {
return out.actionWithMsg(
"getProfile()", { "Profile of Tom" }, 1500.milliseconds
)
}
suspend fun notifyMetricsOfUserDataLoaded() {
return out.actionWithMsg(
"${Emojis.TURTLE}notifyMetricsOfUserDataLoaded()${Emojis.TURTLE}", {
throw MyRuntimeException.create(
"metric-system-is-down",
out
)
}, 5000.milliseconds
)
}
}
data class User(val name: String, val friends: Any, val profile: Any)
Command to reproduce:
gt.sandbox.checkout.commit d501d9b796c8f36becb3 \
&& cd "${GT_SANDBOX_REPO}" \
&& cmd.run.announce "./gradlew run --quiet"
Recorded output of command:
Picked up JAVA_TOOL_OPTIONS: -Dkotlinx.coroutines.debug
Picked up JAVA_TOOL_OPTIONS: -Dkotlinx.coroutines.debug
[INFO][elapsed: 28ms][🥇][①][coroutname:@runBlocking-l2#3][tname:main/tid:1] [>] Starting action=[cancel runBlocking-l2 scope] with delay before action of [2000 ms]
[INFO][elapsed: 40ms][🥇][⓶][coroutname:@getName()#4][tname:main/tid:1] [>] Starting action=[getName()] with delay before action of [500 ms]
[INFO][elapsed: 40ms][🥇][⓷][coroutname:@getFriends()#5][tname:main/tid:1] [>] Starting action=[getFriends()] with delay before action of [1000 ms]
[INFO][elapsed: 40ms][🥇][⓸][coroutname:@getProfile()#6][tname:main/tid:1] [>] Starting action=[getProfile()] with delay before action of [1500 ms]
[INFO][elapsed: 541ms][🥇][⓶][coroutname:@getName()#4][tname:main/tid:1] [<] Finished action=[getName()].
[INFO][elapsed: 1041ms][🥇][⓷][coroutname:@getFriends()#5][tname:main/tid:1] [<] Finished action=[getFriends()].
[INFO][elapsed: 1541ms][🥇][⓸][coroutname:@getProfile()#6][tname:main/tid:1] [<] Finished action=[getProfile()].
[INFO][elapsed: 1541ms][🥇][⓹][coroutname:@runBlocking-l2#2][tname:main/tid:1] We have all the required user data, let's just notify metrics...
[INFO][elapsed: 1547ms][2️⃣][⓺][coroutname:@notifyMetricsOfUserDataLoaded()#7][tname:DefaultDispatcher-worker-1/tid:31] [>] Starting action=[🐢notifyMetricsOfUserDataLoaded()🐢] with delay before action of [5000 ms]
[INFO][elapsed: 1548ms][🥇][⓹][coroutname:@runBlocking-l2#2][tname:main/tid:1] User data that can be used by user: User(name=Tom, friends=kotlin.Unit, profile=kotlin.Unit)
[INFO][elapsed: 2040ms][🥇][①][coroutname:@runBlocking-l2#3][tname:main/tid:1] [<] Finished action=[cancel runBlocking-l2 scope].
[ERROR][elapsed: 2084ms][🥇][⓹][coroutname:@runBlocking-l2#2][tname:main/tid:1] EXCEPTION AT runBlocking-l2 [CancellationException] with msg=[cancel entire runBlocking-l2 scope]!
[ERROR][elapsed: 2084ms][🥇][⓻][coroutname:@runBlocking-l1#1][tname:main/tid:1] runBlocking-l2 threw error that we caught [CancellationException] with msg=[cancel entire runBlocking-l2 scope] (inRunBlocking-l1)!
[INFO][elapsed: 2085ms][🥇][⓻][coroutname:@runBlocking-l1#1][tname:main/tid:1] Delaying for 5s what_for=[Delay for background scope to finish]
[WARN][elapsed: 6549ms][2️⃣][⓺][coroutname:@notifyMetricsOfUserDataLoaded()#7][tname:DefaultDispatcher-worker-1/tid:31] 💥 throwing exception=[MyRuntimeException] with msg=[metric-system-is-down]
[WARN][elapsed: 6550ms][2️⃣][⓺][coroutname:@notifyMetricsOfUserDataLoaded()#7][tname:DefaultDispatcher-worker-1/tid:31] [<][💥] Finished action=[🐢notifyMetricsOfUserDataLoaded()🐢], threw exception of type=[MyRuntimeException].
[WARN][elapsed: 6551ms][2️⃣][⓼][coroutname:@coroutine#8][tname:DefaultDispatcher-worker-1/tid:31] background failure handled, failure_message=[metric-system-is-down] in coroutine_context=[[com.glassthought.sandbox.MainKt$createMetricsScope$$inlined$CoroutineExceptionHandler$1@74c218a0, CoroutineName(notifyMetricsOfUserDataLoaded()), CoroutineId(7), "notifyMetricsOfUserDataLoaded()#7":StandaloneCoroutine{Cancelling}@40ca41e1, Dispatchers.Default]]
[INFO][elapsed: 7085ms][🥇][⓻][coroutname:@runBlocking-l1#1][tname:main/tid:1] Done delaying for 5s what_for=[Delay for background scope to finish]
We dive into this issue more in +Job()/+SupervisorJob() must-inherit-from-parent or it will ⚠️not respect parent cancellation⚠️
Fix with SupervisorScope(parentJob)
In this fix we pass the parent job so that when we create the new scope we can create SupervisorJob(parentJob)
so that when we cancel parentJob
our background metric scope behaves and also cancels itself.
GT-Sandbox-Snapshot
Code
package com.glassthought.sandbox
import com.glassthought.sandbox.util.out.impl.out
import gt.sandbox.util.output.Emojis
import gt.sandbox.util.output.Out
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.CoroutineName
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Job
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.async
import kotlinx.coroutines.cancel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlin.time.Duration.Companion.milliseconds
import kotlin.time.Duration.Companion.seconds
private suspend fun mainImpl(out: Out, parentJob: Job) {
val metricsScope = createMetricsScope(out, parentJob)
val useCase = GetUserDataUseCase(UserDataRepository(), metricsScope)
val userData = useCase.getUserData()
out.info("User data that can be used by user: $userData")
delay(3000.milliseconds)
}
private suspend fun createMetricsScope(out: Out, parentJob: Job): CoroutineScope {
val coroutineExceptionHandler = CoroutineExceptionHandler { coroutineContext, throwable ->
runBlocking {
out.warn("background failure handled, failure_message=[" + throwable.message + "] in coroutine_context=[" + coroutineContext + "]")
}
}
return CoroutineScope(
SupervisorJob(parentJob)
+ CoroutineName("MetricsScope")
+ coroutineExceptionHandler
)
}
class GetUserDataUseCase(
private val repo: UserDataRepository,
private val metricsScope: CoroutineScope,
) {
suspend fun getUserData() = coroutineScope {
val name = async(CoroutineName("getName()")) { repo.getName() }
val friends = async(CoroutineName("getFriends()")) { repo.getFriends() }
val profile = async(CoroutineName("getProfile()")) { repo.getProfile() }
// Await to get all the user data - ESSENTIAL call!
val user = User(
name = name.await(),
friends = friends.await(),
profile = profile.await()
)
out.info("We have all the required user data, let's just notify metrics...")
// Notify metric system that we got the user data. On another scope.
metricsScope.launch(
CoroutineName("notifyMetricsOfUserDataLoaded()")
) { repo.notifyMetricsOfUserDataLoaded() }
user
}
}
fun main(): Unit {
// NOTE: nested runBlocking is an ANTI-PATTERN! but it makes it easier to
// illustrate the issue with SupervisorJob() not inheriting from parent scope.
//
// runBlocking-l1
runBlocking(CoroutineName("runBlocking-l1")) {
try {
// runBlocking-l2
runBlocking(CoroutineName("runBlocking-l2")) {
val runBlockingL2 = this
launch {
out.actionWithMsg(
"cancel runBlocking-l2 scope", {
runBlockingL2.cancel(
CancellationException("cancel entire runBlocking-l2 scope")
)
}, delayDuration = 2.seconds
)
}
try {
mainImpl(out, runBlockingL2.coroutineContext[Job]!!)
out.info("DONE - runBlocking-l2 - mainImpl() finished without exceptions!")
} catch (e: Exception) {
out.error("EXCEPTION AT runBlocking-l2 [${e::class.simpleName}] with msg=[${e.message}]!")
}
}
} catch (e: Exception) {
out.error("runBlocking-l2 threw error that we caught [${e::class.simpleName}] with msg=[${e.message}] (inRunBlocking-l1)!")
}
out.delayNamed(5.seconds, "Delay for background scope to finish")
}
}
class UserDataRepository {
suspend fun getName(): String {
return out.actionWithMsg(
"getName()", { "Tom" }, 500.milliseconds
)
}
suspend fun getFriends() {
return out.actionWithMsg(
"getFriends()", { listOf("Alice", "Bob") }, 1000.milliseconds
)
}
suspend fun getProfile() {
return out.actionWithMsg(
"getProfile()", { "Profile of Tom" }, 1500.milliseconds
)
}
suspend fun notifyMetricsOfUserDataLoaded() {
return out.actionWithMsg(
"${Emojis.TURTLE}notifyMetricsOfUserDataLoaded()${Emojis.TURTLE}", {
throw MyRuntimeException.create(
"metric-system-is-down",
out
)
}, 5000.milliseconds
)
}
}
data class User(val name: String, val friends: Any, val profile: Any)
Command to reproduce:
gt.sandbox.checkout.commit dc16b2ee27fe6d3c473b \
&& cd "${GT_SANDBOX_REPO}" \
&& cmd.run.announce "./gradlew run --quiet"
Recorded output of command:
Picked up JAVA_TOOL_OPTIONS: -Dkotlinx.coroutines.debug
Picked up JAVA_TOOL_OPTIONS: -Dkotlinx.coroutines.debug
[INFO][elapsed: 26ms][🥇][①][coroutname:@runBlocking-l2#3][tname:main/tid:1] [>] Starting action=[cancel runBlocking-l2 scope] with delay before action of [2000 ms]
[INFO][elapsed: 36ms][🥇][⓶][coroutname:@getName()#4][tname:main/tid:1] [>] Starting action=[getName()] with delay before action of [500 ms]
[INFO][elapsed: 37ms][🥇][⓷][coroutname:@getFriends()#5][tname:main/tid:1] [>] Starting action=[getFriends()] with delay before action of [1000 ms]
[INFO][elapsed: 37ms][🥇][⓸][coroutname:@getProfile()#6][tname:main/tid:1] [>] Starting action=[getProfile()] with delay before action of [1500 ms]
[INFO][elapsed: 538ms][🥇][⓶][coroutname:@getName()#4][tname:main/tid:1] [<] Finished action=[getName()].
[INFO][elapsed: 1037ms][🥇][⓷][coroutname:@getFriends()#5][tname:main/tid:1] [<] Finished action=[getFriends()].
[INFO][elapsed: 1537ms][🥇][⓸][coroutname:@getProfile()#6][tname:main/tid:1] [<] Finished action=[getProfile()].
[INFO][elapsed: 1538ms][🥇][⓹][coroutname:@runBlocking-l2#2][tname:main/tid:1] We have all the required user data, let's just notify metrics...
[INFO][elapsed: 1544ms][2️⃣][⓺][coroutname:@notifyMetricsOfUserDataLoaded()#7][tname:DefaultDispatcher-worker-1/tid:31] [>] Starting action=[🐢notifyMetricsOfUserDataLoaded()🐢] with delay before action of [5000 ms]
[INFO][elapsed: 1545ms][🥇][⓹][coroutname:@runBlocking-l2#2][tname:main/tid:1] User data that can be used by user: User(name=Tom, friends=kotlin.Unit, profile=kotlin.Unit)
[INFO][elapsed: 2037ms][🥇][①][coroutname:@runBlocking-l2#3][tname:main/tid:1] [<] Finished action=[cancel runBlocking-l2 scope].
[ERROR][elapsed: 2069ms][🥇][⓹][coroutname:@runBlocking-l2#2][tname:main/tid:1] EXCEPTION AT runBlocking-l2 [CancellationException] with msg=[cancel entire runBlocking-l2 scope]!
[WARN][elapsed: 2071ms][2️⃣][⓺][coroutname:@notifyMetricsOfUserDataLoaded()#7][tname:DefaultDispatcher-worker-1/tid:31] 🫡 I have caught [CancellationException/cancel entire runBlocking-l2 scope], and rethrowing it 🫡
[INFO][elapsed: 2071ms][2️⃣][⓺][coroutname:@notifyMetricsOfUserDataLoaded()#7][tname:DefaultDispatcher-worker-1/tid:31] [<][🫡] Cancellation Exception - rethrowing.
[ERROR][elapsed: 2071ms][🥇][⓻][coroutname:@runBlocking-l1#1][tname:main/tid:1] runBlocking-l2 threw error that we caught [CancellationException] with msg=[cancel entire runBlocking-l2 scope] (inRunBlocking-l1)!
[INFO][elapsed: 2072ms][🥇][⓻][coroutname:@runBlocking-l1#1][tname:main/tid:1] Delaying for 5s what_for=[Delay for background scope to finish]
[INFO][elapsed: 7072ms][🥇][⓻][coroutname:@runBlocking-l1#1][tname:main/tid:1] Done delaying for 5s what_for=[Delay for background scope to finish]
Children
Backlinks