Da Bug Dat Broke Da Brain!
Finding and fixing bugs in your code is an essential software development skill. It is something I do pretty much every (working) day of my life. Sometimes I can dig my way straight to the problem. Sometimes it takes a bit of head-scratching and debugging to figure out what is going on. And sometimes, hopefully very rarely, I hit upon a nasty, bizarre bug that stumps me. The kind that makes me question my sanity. The kind that makes me feel like an idiot and a newbie.
I recently had such an experience. I was stuck for a couple months on a bug so bizarre that I feel compelled to write about it. I can only hope that someone somewhere, if they are misfortunate enough to fall into this situation, can benefit from my pain. Read on to hear my sad tale of confusion and frustration.
A Bit of Background
First, a quick recap of what my app Bravauto is about. Back in 2018, an old friend asked my opinion on an idea he had for a mobile app. He owns a used-car business and buys cars at auction to sell on his lot. Dealers have to be careful not to spend too much on a car bought at auction or they won’t be able to make a profit on it. This app idea was to allow the dealer to track cars they are interested in, recording the necessary repairs or improvements to get it ready for resale, as well as the dealer’s other expenses.
After being laid off that fall, I offered to take a stab at building this app. I also used it as a learning experience, switching from my long-time companion Ruby on Rails to Elixir and the Phoenix web framework. For the mobile app, there really wasn’t much of a choice. I wasn’t about to start learning Android and iOS development. I already knew (or thought I knew) JavaScript so building it in React Native should be pretty straightforward, right? Heh.
I did manage to get a basic app up and running fairly quickly. I started out building the backend and a basic web app with Phoenix. Then I started working on building a mobile version of the app. I could have just built it as a “progressive web app” but my friend said that he really wanted it to be a “real” mobile app, and that it should ultimately be in the Apple and Google app stores. Okay, so React Native it is. (I did look at Flutter but that would mean learning the Dart language, and it didn’t have nearly as much mindshare as React Native.)
By the time I started my new job in July of 2019, I had a simple mobile app with a login screen and a screen that showed the user’s list of cars they might want to bid on. The login functionality just used a JSON REST endpoint served by my Phoenix backend, and the list of vehicles was pulled via web socket using a Phoenix Channel. This was kind of silly because I didn’t really need a real-time list for this data (the user is the only one who could change it) but again, I wanted to learn new things as much as build a functioning app. And eventually I would want the app to get real-time updates about a vehicle as it comes up for auction so down the road the socket stuff could come in handy.
Another technology that I was interested in at the time was GraphQL. Couldn’t hurt to get some exposure to this red-hot technology, right? So, last year I replaced the REST login endpoint with a GraphQL mutation (and another for signup). I used the Absinthe Elixir GraphQL library on the backend and the Apollo Client on the frontend. So far so good.
Seeing as I was now working full-time, I only got around to working on Bravauto on the occasional Saturday. There were long stretches of time without working on the app at all. And, one thing that is every bit as certain as having bugs in your code is the language and libraries you use getting updates. It seemed like every time I came back to the app there was something to update!
One of my New Year’s resolutions for 2021 was to start pushing myself to work on Bravauto more consistently. I figured a good place to start would be to update libraries. I updated to the latest version of Elixir, Phoenix, and Absinthe. I updated Expo, React, and React Native, dealing with the usual odd issues and frustrations as they came up. I updated to Apollo 3, which brought cool new features, but was also a pretty dramatic change. Then I got around to updating to React Navigation 5. This required some rework as well. It seems one thing you can count on in the software world is having to do some pretty significant rewrites to keep your code up to date!
An Update… and An Important New Feature
I started working on updating to React Navigation 5 around mid-January. Even though this new version meant some pretty dramatic changes, I didn’t think it would be too much work for me. After all, I only had the login and registration screens and then the main screen, which shows the list of vehicle evaluations. Back when I started working on the app, I used the Expo generator to get an app with basic navigation, including a tabbed interface. So it was a little more complex than just the three screens, but still, I didn’t think it would be too hard.
I made the required package changes and replaced createAppContainer
with using a NavigationContainer
. I added my Stack.Navigator
and Stack.Screen
components. I had also been meaning to improve the user experience by not requiring login every time you start the app. I was already storing the user’s token in AsyncStorage
, so when the app starts up it should look in storage for the token, and if we have it, not make the user log in again. I figured I might as well take care of that improvement as long as I was messing with the flow between screens.
I found some helpful blog posts on this, but the official React Navigation docs has a handy page on the subject as well. Following that example, I added a reducer in my App.js:
const [state, dispatch] = React.useReducer(
(prevState, action) => {
switch (action.type) {
case "RESTORE_TOKEN":
return {
...prevState,
token: action.token,
isLoading: false,
};
case "SIGN_IN":
return {
...prevState,
isSignout: false,
token: action.token,
};
case "SIGN_OUT":
Auth.clearToken();
return {
...prevState,
isSignout: true,
token: null,
};
}
},
{
isLoading: true,
isSignout: false,
token: null,
}
);
And then a useEffect
call to look for the token when we start up:
React.useEffect(() => {
const getAuthToken = async () => {
let token = await Auth.getToken();
if (token) dispatch({ type: "RESTORE_TOKEN", token: token });
};
getAuthToken();
}, []);
And whereas the old post-login behavior was to navigate to the main screen explicitly (props.navigation.navigate("Main")
), I followed the suggested approach of rendering the appropriate screen(s) based on having a token or not:
<Stack.Navigator>
{state.token ? (
<Stack.Screen name="Main" component={MainTabNavigator} />
) : (
<>
<Stack.Screen name="Registration" component={RegistrationScreen} />
<Stack.Screen name="Login" component={LoginScreen} />
</>
)}
</Stack.Navigator>
If we have the user’s token we render the MainTabNavigator
, otherwise we render the registration and login screens (which have links to each other). Simple right? Well, here’s where the pain begins!
Descent Into Madness
After login, instead of seeing my main screen with its list of vehicles, I got the dreaded React Native red error screen. The app was going into a re-render loop until React killed it. Basically, in that piece of code above it was taking the “else” part of the ternary and rendering the login screen (I had removed the registration screen while debugging this) and since the login screen already had the user data, it called the function to set the token again, which re-rendered the App again, and so on.
So, I did what any programmer would do and started debugging the issue. Being a mostly full-time Elixir dev now, I started sticking in console.log
statements to check the values of my variables. (Debugging by logging is much more common in the Elixir world than using a step-debugger.) The output of a console.log
immediately before this Stack.Navigator
component showed that it had the token! What? How can a JavaScript ternary have a truthy value but take the “else” branch?!?
I figured out how to set up a step debugger for React Native and stepped through the code. Stopping on the console.log
, I could clearly see state.token
had the token value. Then I advanced one step and it went into the true branch of the ternary! That was reassuring in a way: JavaScript itself was not broken! But now I was stopped on a line that should be rendering the MainTabNavigator
and yet the next thing executed is code in the LoginScreen
! Instead of solving my problem, debugging has made it even more bizarre.
At this point, I was completely stumped and more stuck than ever. What do I do now? To be honest, this makes it sound like this was all in one coding session. What actually happened was that I got this issue and tried all sorts of things, basically flailing around, trying to get it to work right. At one point, I asked my brother – who is a React teacher – for help. He knows React but doesn’t like React Native and couldn’t figure out what was wrong. I went away and came back to it a few times.
Then I started asking for help in forums and chat. I posted a question on StackOverflow and asked in the React Native Discord and Slack channels. In one of those channels, someone advised me to be very wary of trying to debug (with console.log
or a debugger) React due to React’s asynchronous nature. He explained that you can see a value logged, but when it is actually used it is something else. Nasty. And this seemed like it could be the case for me. At least I was seeing a state.token
variable with a value and it appeared to take the true branch of a ternary but the result of that was clearly the false branch.
Finally, a helpful soul in a React Native Slack channel pointed out that my Apollo GraphQL login mutation had an issue. Apollo now provides a React hook called useMutation
which returns a function you can call to invoke the mutation, as well as an object with properties that will have values at various times: data
will get the results of the mutation, error
will get the error message if one occurs, and loading
will be true while the mutation is running and false otherwise:
const [signin, { data, error, loading }] = useMutation(SIGNIN_MUTATION);
This lets you do something like:
if (loading) {
// render spinner
}
So of course, I also did this:
if (data) {
storeAuthToken(data.signin.token);
setAuth(data.signin.token);
}
If we get the data back from the mutation call we store the token and then call the setAuth
function that was passed to this form via React Context. Seems logical right? And the logging and the debugger suggested that this was working correctly. But I got the infinite render loop. I was also getting a warning from React: Cannot update a component from inside the function body of a different component.
All the searching in the world couldn’t turn up a relevant explanation for that or my infinite loop.
At Last, Salvation!
So what my helpful Slack friend pointed out is that the Apollo useMutation
hook also takes an optional onCompleted
callback:
onCompleted A callback executed once your query successfully completes.
I had actually seen this in the docs at one point! But the fact that I could log the value of data
inside that if
block had me totally fooled that that piece was working fine. It was not. Once I moved the code inside the if (data) ...
block into the onCompleted
callback it worked great! And no more warning either:
const [signin, { error, loading }] = useMutation(SIGNIN_MUTATION, {onCompleted: (data) => {
storeAuthToken(data.signin.token);
setAuth(data.signin.token);
}});
I was pretty annoyed that some example somewhere had shown destructuring data
from the return value of useMutation
! Why would you show that if it’s not safe?! Then I realized that the example that I saw taking that approach used the data
variable in the JSX/render portion of the component! I’m not 100% positive, but I think that trying to use that variable in the pre-render logic (setting state in the parent App component) caused the bizarre behavior, but if I’d just wanted to render the token (or, for when I get to the vehicle list screen and pull that data with the useQuery
hook), it is fine to just use data
in the JSX. We shall see when I start working on that screen, which should be coming up soon.
This was probably the weirdest bug I’ve ever come across. It was confounding and made me seriously consider giving up on this project altogether. A classic gumption trap! I am glad I got past it though. Not just because I can move forward on my app, but because it was a painful and powerful lesson in the asynchronous nature of React. The code in a component outside of the return
is a very different world from the pseudo-HTML inside that render block. Hopefully now I can start knocking out new features and get somewhere with this app!
I have one final thought on this bug. Besides being a painful lesson in respecting the asynchronous nature of React, I feel that it is a powerful warning about React and JavaScript in general. I still don’t know exactly how my app was able to invoke the code in the false branch of a ternary. I probably never will. This problem was solved not by debugging and seeing the problem, but by a suggestion to try another approach. I understand that somehow that data
variable was truthy enough to cause my setAuth
function to get called and trigger a re-render. But I don’t fully grok how it went sideways from there. I’m willing to just accept that Apollo fires the onCompleted
callback at the correct time, but I’m not really happy about it. Any programming language and library/framework that makes it this easy to shoot yourself in the foot is cause for extreme caution. React Native is a necessary evil for me to achieve my goal, but from here on out, I consider it just that: evil. Your mileage may vary.
Comments:
Send me a comment!
(Comments are manually added, so be patient.)