[MS] The case of the crash on a null pointer even though we checked it for null - devamazonaws.blogspot.com
A colleague was investigating a crash. The stack at the point of the crash looks like this:
contoso!winrt::impl::consume_Windows_Foundation_Collections_IVectorView< winrt::Windows::Foundation::Collections::IVectorView<int>, int>::Size+0x30 contoso!winrt::Contoso::implementation::Widget:: InitializeNodesAsync$_ResumeCoro$1+0x2bc contoso!wil::details::coro::apartment_resumer::resume_apartment_callback+0x28 combase!CRemoteUnknown::DoCallback+0x34 combase!CRemoteUnknown::DoNonreentrantCallback+0x48 rpcrt4!Invoke+0x64 rpcrt4!InvokeHelper+0x130 rpcrt4!Ndr64StubWorker+0x6cc rpcrt4!NdrStubCall3+0xdc combase!CStdStubBuffer_Invoke+0x6c combase!ObjectMethodExceptionHandlingAction<⟦...⟧>+0x48 combase!DefaultStubInvoke+0x2b8 combase!SyncServerCall::StubInvoke+0x40 combase!StubInvoke+0x170 combase!ServerCall::ContextInvoke+0x3c4 combase!ReentrantSTAInvokeInApartment+0x1fc combase!ComInvokeWithLockAndIPID+0xcc4 combase!ThreadDispatch+0x514 combase!ThreadWndProc+0x1b4 user32!UserCallWinProcCheckWow+0x180 user32!DispatchMessageWorker+0x130
If we look at the point of the fault:
0:001> r Last set context: x0=0000000000000000 x1=00000053af4fdf78 x2=0000017b4f7e6380 x3=0000000000000000 x4=6597e92abf947185 x5=857194bf2ae99765 x6=857194bf2ae99765 x7=0000017b4f700000 x8=00007ff85b11ce40 x9=0000000000000001 x10=0000006700000000 x11=0000000000000000 x12=fffffffffff00000 x13=0000000000000000 x14=0000000000000000 x15=000000000000000b x16=00007ff8d4fd44a0 x17=ffff68553f08a1c6 x18=00007ff8d0bc0000 x19=0000017b4c834de0 x20=0000000000000000 x21=00007ff8d45cd188 x22=00007ff8d45cd188 x23=00007ff8d45cd150 x24=0000017b31196930 x25=00000053af4fdfa0 x26=00000053af4fe580 x27=0000000000000010 x28=0000000000000000 fp=00000053af4fdf90 lr=00007ff85af448cc sp=00000053af4fdf60 pc=00007ff85af448f0 psr=80001040 N--- EL0 contoso!winrt::impl::consume_Windows_Foundation_Collections_IVectorView< winrt::Windows::Foundation::Collections::IVectorView<int>, int>::Size+0x30: 00007ff8`5af448f0 f9400008 ldr x8,[x0]
we see that we are crashing on a null pointer (x0).
The function is a C++/WinRT consume_
, which is just a projection of the underlying COM call. The COM call is performed by reading the vtable pointer from the object, reading the function pointer from the vtable, and then calling the function.
The fact that we are reading from x0 (the inbound and outbound parameter slot) means that this is almost certainly reading the vtable pointer from the object: We want the COM pointer in x0 for the outbound call, so the obvious thing to do is to leave it there while you read the vtable pointer from it.
We can confirm this by reading the disassembly.
0:001> u .-30 . contoso!winrt::impl::consume_Windows_Foundation_Collections_IVectorView< winrt::Windows::Foundation::Collections::IVectorView<int>, int>::Size+0x30: 00007ff8`5af448c0 stp fp,lr,[sp,#-0x10]! ; build stack frame 00007ff8`5af448c4 mov fp,sp 00007ff8`5af448c8 bl contoso!__security_push_cookie (00007ff8`5ab91050) 00007ff8`5af448cc sub sp,sp,#0x20 00007ff8`5af448d0 mov w8,#0x1E4 00007ff8`5af448d4 ldr x0,[x0] ; fetch the COM pointer 00007ff8`5af448d8 str wzr,[sp,#0x18] 00007ff8`5af448dc str w8,[sp] 00007ff8`5af448e0 adrp x8,contoso!`string'+0x10 (00007ff8`5b11c000) 00007ff8`5af448e4 add x8,x8,#0xE40 00007ff8`5af448e8 stp x8,xzr,[sp,#8] 00007ff8`5af448ec add x1,sp,#0x18 00007ff8`5af448f0 ldr x8,[x0] ; read the vtable
(The other code is recording the line number and file name for diagnostic purposes.)
Okay, so we are calling IVectorView<T>::Size
on a null pointer.
Let's see whose idea that is.
Here's the caller:
winrt::IAsyncAction Widget::InitializeNodesAsync() { auto lifetime = get_strong(); std::optional<winrt::IVectorView<int32_t>> numbers; co_await winrt::resume_background(); CallWithRetry([&] { numbers = GetMagicNumbers(); }); if (numbers == nullptr) { co_return; } co_await winrt::resume_foreground(m_uithread); std::vector<winrt::Node> nodes; nodes.reserve((*numbers).Size()); // ← CRASH HERE
The crash inside consume_
tells us that (*numbers)
is null. Let's see if we can confirm that in the debugger.
First, we have to find numbers
.
0:001> dv /V @x19 @x19 __coro_frame_ptr = 0x0000017b`4c834de0 00000000`00000088 @x25+0x0590 lifetime = struct winrt::com_ptr< winrt::Contoso::implementation::Widget> 00000000`000000e8 @x25+0x0590 nodes = class std::vector<int> 00000000`000000d8 @x25+0x0590 numbers = class std::optional<winrt::Windows::Foundation:: Collections::IVectorView<int> > ⟦ ... ⟧
The debugger says that numbers
is at @x25+0x0590
, and that this calculates out to 00000000`000000d8
, which is nonsense. So we can't really trust that calculation.
Let's see what the code uses. We disassemble backward from the return address.
0:001> k2 Child-SP RetAddr Call Site 00000053`af4fdf60 00007ff8`5b059c8c contoso!winrt::impl::consume_Windows_Foundation_Collections_IVectorView< winrt::Windows::Foundation::Collections::IVectorView<int>, int>::Size+0x30 00000053`af4fdfa0 00007ff8`5abc5108 contoso!winrt::Contoso::implementation::Widget:: InitializeNodesAsync$_ResumeCoro$1+0x2bc
We read the return address from the function one deeper on the stack, giving us 00007ff8`5b059c8c
.
00007ff8`5b059c84 add x0,x19,#0xD8 // ← setting up the call to consume_ 00007ff8`5b059c88 bl contoso!winrt::impl::consume_Windows_Foundation_Collections_IVectorView< winrt::Windows::Foundation::Collections::IVectorView<int>, int>::Size 00007ff8`5b059c8c uxtw x1,w0
From the disassembly, we see that the compiler stored the IVector
part of the numbers
at offset 0xD8
from the coroutine frame, which is in x19
.
We can pluck the coroutine frame from the dv
output, or we can ask the debugger to restore the nonvolatile registers for us (which includes x19
):
0:001> .frame /c 2
0:001> dps @x19+0xd8
0000017b4c834eb8 0000000000000000 // <<<<< the stored IVector
0000017b4c834ec0 0000017b4ff78400
We can ask the debugger for the layout of the std::optional
so we can see the full numbers
.
I copied the type name from the earlier dv
output.
0:001> dt contoso!std::optional<winrt::Windows::Foundation::Collections::IVectorView<int> > +0x000 _Dummy : std::_Nontrivial_dummy_type +0x000 _Value : winrt::Windows::Foundation::Collections::IVectorView<int> +0x008 _Has_value : Bool
Okay, so the value is kept at offset zero, and the _Has_value
is at offset 8.
We can eyeball from the earlier dps command that the _Value
is nullptr, and the _Has_value
is false. (Little-endian means that the single bool
is in the least significant byte of the 8-byte value.)
Or we can ask the debugger to interpret it for us.
0:001> dt contoso!std::optional<winrt::Windows::Foundation::Collections::IVectorView<int> > @x19+0xd8 +0x000 _Dummy : std::_Nontrivial_dummy_type +0x000 _Value : winrt::Windows::Foundation::Collections::IVectorView<int> +0x008 _Has_value : 0
Okay, so the numbers
has no value.
But wait, our code checked for that!
if (numbers == nullptr) { co_return; }
Why didn't that work?
Because that's not how std::optional
works.
The std::optional
is a sum type of T
with a special value called std::nullopt
. If you compare a std::optional
against anything that isn't std::nullopt
, then you are checking if the std::optional
has a value that matches the value you are comparing against.
std::optional holds |
compared with | |
---|---|---|
std::nullopt |
Y |
|
std::nullopt |
true | false |
X |
false | X == Y |
For the purposes of comparison, an empty std::optional
is treated as if it had the value std::nullopt
, which is a value distinct from the value values of T
.
Therefore, writing if (numbers == nullptr)
means "if numbers
has a value that is equal to nullptr
".
But in this case, numbers
has no value, so the comparison fails, and we fall through.
Then we dereference the *numbers
, which is specified to retrieve the wrapped value, and it is undefined behavior if the numbers
has no value.
In our case, the numbers
indeed has no value, so we have entered the world of undefined behavior. In practice, what happens is that we read whatever is in _Value
, and we saw in the debugger, that _Value
holds a null pointer. We then try to call Size()
on a null pointer and crash.
One fix is to change the test from if (numbers == nullptr)
to if (!numbers.has_value())
to ask whether the numbers
is empty.
But this is working too hard.
The use of std::optional*<T>
was itself unnecessary. There is already a natural empty value for IVector
, namely nullptr
. So we can declare numbers
as an IVector
, which default-initializes to nullptr
, and then check whether it is still nullptr
after we try (and possibly fail) to get a value.
winrt::IAsyncAction Widget::InitializeNodesAsync() { auto lifetime = get_strong(); winrt::IVectorView<int32_t> numbers; // remove std::optional co_await winrt::resume_background(); CallWithRetry([&] { numbers = GetMagicNumbers(); }); if (numbers == nullptr) { co_return; } co_await winrt::resume_foreground(m_uithread); std::vector<winrt::Node> nodes; nodes.reserve(numbers.Size()); // remove the * ⟦...⟧
This change also covers the case where GetMagicNumbers
succeeds but returns a null pointer.
In practice, GetMagicNumbers
never returns a null pointer because it knows that the empty set is not the same as no set at all. The original code was testing against something that never happens.
Post Updated on September 05, 2025 at 03:00PM
Thanks for reading
from devamazonaws.blogspot.com
Comments
Post a Comment