How switching from sync to async almost cost me my sanity

It is a sunny day. Kids are playing. I'm sitting in front of my laptop and decide to implement user management in my app. Sounds fun? Sounds fun!

But first, a bit of background on the project. It is built in Python using FastAPI. I'm not gonna bother you with more details here, since that is not the topic of this post. In short: everything is nice and synchronous when you set up FastAPI.

All my tests are passing, all the code is clean (I swear!). Now let me create that sweet new branch for implementing that user management.

So I hop right into the documentation.

Me: Full example. Perfect! Let's see... ahaa, should be easy to implement.

Narrator: It was not easy to implement.

After 3 hours it is all set and done. The database is working, the routes are protected. The code is not so clean, but it works. I'm proud. Now that everything is working and the manual testing is going great, it's time to update the tests, so that refactoring the code won't break anything.

Narrator: It was already broken.

I'm starting the tests, surely the user ones will break, but all the other stuff should be fine.

User Test (8/8) failed. Yeah fine, this is what I expected.

Accounts Test (5/5) failed. Weird...

Transaction Tests (12/12) failed. All of them?! How can this be?!

As always, let's check the error messages first. Something like "greenlet_spawn has not been called". *Googling...* Alright, that doesn't tell me anything, but there are some solutions I can understand.

*Doesn't get it at all*

Narrator: He didn't understand it at all.

It just does not make any sense. How can it be that my code runs just fine when I'm testing it with Postman, but as soon as I run the tests, everything just breaks?!?!?!?!

At least I know that it has something to do with async code. So time to go back to the full example in the docs.

To work with your DBMS, you'll need to install the corresponding asyncio driver. The common choices are:

Aha! I'm using an asynchronous driver now. I just wish I knew why my tests were still broken.

So again, let me fish in the dark by simply googling for pytest async tests. The results show up. Nothing I can make any sense of. Dang it!

Alright, since I'm using an asynchronous driver, this probably has something to do with the database. Let's check SQLAlchemy's docs for this.

Will you look at that?! It makes a bit more sense here. Great. Let's use some of their examples and adjust my code.

Narrator: Nope, this was not the solution.

Error: [!] Object '<User at 0x7f12bc185a90>' is already attached to session '1' (this is '2')

Error: The garbage collector is trying to clean up connection <AdaptedConnection <asyncpg.connection.Connection object at 0x10c89e040

Error: RuntimeError: Task <Task pending name='Task-69' coro=<run_scoring_operation() running at ***> cb=[gather.<locals>._done_callback() at /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/asyncio/tasks.py:769]> got Future <Future pending> attached to a different loop
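For what it's worth, the last of those errors can be reproduced with nothing but the standard library. The sketch below is only an illustration of the general mechanism, not the code from this project: it assumes two separate event loops (which is roughly what can happen when a test runner and the application each bring their own) and a hypothetical helper named `reproduce_different_loop_error`. A future created on one loop is awaited by a task running on another, and asyncio refuses.

```python
import asyncio


def reproduce_different_loop_error() -> str:
    """Await a future from loop A inside a task on loop B (hypothetical helper)."""
    loop_a = asyncio.new_event_loop()
    loop_b = asyncio.new_event_loop()
    try:
        # This future belongs to loop_a.
        future = loop_a.create_future()

        async def wait_for_it():
            # Awaited from a task that loop_b is running.
            await future

        try:
            loop_b.run_until_complete(wait_for_it())
        except RuntimeError as exc:
            return str(exc)
        return "no error"
    finally:
        loop_a.close()
        loop_b.close()


if __name__ == "__main__":
    print(reproduce_different_loop_error())
```

Anything that is created on one loop and then touched from another can trip over this, which is one reason to keep everything in a test run on a single loop.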

I've been struggling with this situation for a whole week now. The frustration builds up. The imposter syndrome is kicking in. I'm just annoyed by stumbling from one meaningless error to the next.

It is time for the ultimate hack to get this solved: step away from it for a day.

Narrator: And he was right.

After enjoying life for a day, I sat down again. Guile's Theme started playing in the back of my head. I'm trying stuff out, searching for the errors, reading docs. Full Hackerman Mode.

I am Hackerman

Suddenly BOOM! 💥

ASYNC SESSIONS! It is async sessions!! I finally know what is happening here. I finally found the weapon to slay this nasty bug out of existence!

Now with a few changes, my first tests passed again! A few more changes and the next passed. And the next. Passed, passed, PASSED!

All green. All set. All awesome!

Narrator: Time for an explanation

So what exactly happened here and what did I learn? Let's start with the easiest one: Why did my code work in the first place?

The answer to this is simple. Since I'm the only one 'using' my API during manual testing, requests come in one at a time, so it behaves like a synchronous application with synchronous database sessions.

Why did my tests fail?

Asynchronous sessions. The tests now run against an asynchronous session and are executed simultaneously. They need to be refactored to reuse the session that was created at the start.
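Here is a minimal sketch of that idea, using only the standard library. Everything named in it is made up for illustration: `FakeSession` is a stand-in for a real async database session, `open_session` is a hypothetical helper, and the `check_*` functions play the role of tests. In a real project the session would come from the database library and the checks would be pytest tests, so treat this as the shape of the fix rather than the fix itself.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeSession:
    """Hypothetical stand-in for an async database session (not a real library class)."""

    def __init__(self):
        self.rows = []
        self.closed = False

    async def add(self, row):
        if self.closed:
            raise RuntimeError("session is already closed")
        await asyncio.sleep(0)  # pretend to wait on the database
        self.rows.append(row)


@asynccontextmanager
async def open_session():
    """Open one session and guarantee it is closed on the way out."""
    session = FakeSession()
    try:
        yield session
    finally:
        session.closed = True


async def check_create_user(session):
    await session.add("user")
    assert "user" in session.rows


async def check_create_account(session):
    await session.add("account")
    assert session.rows == ["user", "account"]


async def run_checks():
    # One session, created once at the start, shared by every check,
    # all inside the same event loop.
    async with open_session() as session:
        await check_create_user(session)
        await check_create_account(session)
    return session.rows, session.closed


if __name__ == "__main__":
    print(asyncio.run(run_checks()))
```

The `async with` block is what keeps the session's lifetime tidy: it is opened once, handed to each check in turn, and closed exactly once at the end, even if a check fails.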

What did I learn from this?

From a technical perspective: asynchronous sessions, the with statement in Python, and pytest.
From a personal perspective: I'm worthy. I'm a problem solver and on the right track.

Remember: if you are stuck in a similar situation (spoiler alert: you will be), do not give up. Use your brain, take breaks, and reach out for help.

If I had, I'd probably have solved all of this way faster.

I've also collected my tweets regarding that journey here. Have fun:

First tweet about that bug
Bragging about solving the bug 1/4
Bragging about solving the bug 2/4
Bragging about solving the bug 3/4
Bragging about solving the bug 4/4