IL2CPP Tutorial: Finding loaders for obfuscated global-metadata.dat files

February 23, 2021

Game publishers are loving it lately. Over the last few months I’ve started to see all kinds of weird and wacky obfuscation schemes designed to prevent Il2CppInspector from loading IL2CPP games.

While it’s quite amusing to see narrowly targeted attacks percolate, it does make the support tickets pile up on the issue tracker, and I unfortunately have neither the time nor the desire to sit and pick apart every file thrown at me. The old adage of giving a person a tool and they’ll hack for a day, but teach a person to write tools and they’ll hopefully stop spamming your social media seems pertinent here. At least I think that’s how the adage goes… or was it fish something? Either way, we all started off as plankton; hopefully you are thirsty to become a shark!

In this tutorial, I’ll walk (swim?) you through how to find the loader for global-metadata.dat in almost any IL2CPP application so that you can reverse engineer it yourself. This will include obfuscated metadata, encrypted metadata, and metadata embedded in the binary itself, plus light obfuscation of the code path to the loader. I’ll also throw in a couple of examples to whet your appetite.

How do I know if global-metadata.dat is obfuscated?

  1. Open global-metadata.dat in a hex editor. Are the first four bytes AF 1B B1 FA? If so, this is a good sign that the file is not obfuscated – but not a guarantee.

    The start of an unobfuscated global-metadata.dat looks as follows – four magic bytes, a 4-byte version number and a list of file offsets and lengths for each table in the file – generally in sequential order.

If it doesn’t look similar to this, it’s probably obfuscated or encrypted.

  2. If you can’t find global-metadata.dat in its usual location (Android: assets/bin/Data/Managed/Metadata/global-metadata.dat, PC: <app-name>_Data/il2cpp_data/Metadata), scan all of the game files for any starting with the above magic bytes. If you find one, it’s probably a renamed global-metadata.dat. If you don’t, the metadata file is likely – but not guaranteed – to be embedded within the game binary itself or another file, or it may be encrypted and stored as one of the other files.
  3. Assuming you’ve found a metadata file with a valid header, try to load it into Il2CppInspector. If the file fails to load with errors such as An item with the same key has already been added, Sequence contains no matching element or Index was outside the bounds of the array, there are a few possibilities, in this order of likelihood:
    • An obfuscated or encrypted metadata file
    • A bug in Il2CppInspector
    • A version of IL2CPP not yet supported by Il2CppInspector. Check which version of Unity the game was made with. It may take us a while to implement support for very recent versions of Unity.

If you have now determined the metadata file is obfuscated or hidden, or you are not sure, proceed to the next step.
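
The first two checks above can be roughly automated. What follows is a minimal Python sketch, not part of Il2CppInspector. The names inspect_header and find_metadata_candidates, the choice to read eight table entries, and the assumption that every header field is a little-endian 32-bit integer are all mine. As with the manual check, passing it is a good sign rather than a guarantee.

```python
import os
import struct

METADATA_MAGIC = bytes.fromhex("AF1BB1FA")  # 0xFAB11BAF as stored on disk


def inspect_header(data, table_count=8):
    """Return (version, tables) if data starts like a plain metadata file, else None.

    Assumption: 32-bit little-endian fields laid out as magic, version,
    then (offset, length) pairs. table_count is an arbitrary sample size.
    """
    header_size = 8 + table_count * 8
    if len(data) < header_size or data[:4] != METADATA_MAGIC:
        return None
    version = struct.unpack_from("<i", data, 4)[0]
    tables = [struct.unpack_from("<ii", data, 8 + i * 8) for i in range(table_count)]
    # Heuristic only: every sampled table should lie inside the file.
    for offset, length in tables:
        if offset < 0 or length < 0 or offset + length > len(data):
            return None
    return version, tables


def find_metadata_candidates(root):
    """Yield paths under root whose first four bytes are the metadata magic."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as f:
                    if f.read(4) == METADATA_MAGIC:
                        yield path
            except OSError:
                continue  # unreadable file: skip it
```

Run find_metadata_candidates over the game folder first, then pass the contents of each hit to inspect_header. A file that has the magic bytes but implausible table offsets is a candidate for partial obfuscation.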

Metadata loader code path

First, the Unity player (Android: libunity.so, PC: UnityPlayer.dll) calls the function il2cpp_init in the main application binary (Android: libil2cpp.so, PC: GameAssembly.dll – note these files can be renamed by the developers). In almost all cases il2cpp_init is a regular export. Execution then follows this code path:

il2cpp_init
  -> il2cpp::vm::Runtime::Init
    -> il2cpp::vm::MetadataCache::Initialize
      -> il2cpp::vm::MetadataLoader::LoadMetadataFile

These last three functions do not usually have their symbols included in the production binary, so you will need to trace a path through to them. By looking at the IL2CPP source code, we can get an idea of what we’re looking for (there is slight variation between versions). Don’t worry about reading all of this code immediately – I’ve provided sizeable snippets because we can use the lines of code around the calls as context – waypoints or milestones if you like – to help us find the correct call chain.

Info: This article on the blog explains the complete initialization process that loads the metadata and binary files.

il2cpp_init (located in il2cpp-api.cpp, comments elided):

int il2cpp_init(const char* domain_name)
{
    setlocale(LC_ALL, "");
    return Runtime::Init(domain_name, "v4.0.30319");
}

il2cpp::vm::Runtime::Init (located in vm/Runtime.cpp):

bool Runtime::Init(const char* filename, const char *runtime_version)
{
    SanityChecks();

    os::Initialize();
    os::Locale::Initialize();
    MetadataAllocInitialize();

    s_FrameworkVersion = framework_version_for(runtime_version);

    os::Image::Initialize();
    os::Thread::Init();
    il2cpp::utils::RegisterRuntimeInitializeAndCleanup::ExecuteInitializations();

    if (!MetadataCache::Initialize())
        return false;
    Assembly::Initialize();
    gc::GarbageCollector::Initialize();

    Thread::Initialize();
    Reflection::Initialize();

    register_allocator(il2cpp::utils::Memory::Malloc);

    memset(&il2cpp_defaults, 0, sizeof(Il2CppDefaults));

    const Il2CppAssembly* assembly = Assembly::Load("mscorlib.dll");

    il2cpp_defaults.corlib = Assembly::GetImage(assembly);
    DEFAULTS_INIT(object_class, "System", "Object");
    DEFAULTS_INIT(void_class, "System", "Void");
    DEFAULTS_INIT_TYPE(boolean_class, "System", "Boolean", bool);
    DEFAULTS_INIT_TYPE(byte_class, "System", "Byte", uint8_t);
    DEFAULTS_INIT_TYPE(sbyte_class, "System", "SByte", int8_t);
    DEFAULTS_INIT_TYPE(int16_class, "System", "Int16", int16_t);
    DEFAULTS_INIT_TYPE(uint16_class, "System", "UInt16", uint16_t);
    DEFAULTS_INIT_TYPE(int32_class, "System", "Int32", int32_t);
    DEFAULTS_INIT_TYPE(uint32_class, "System", "UInt32", uint32_t);
    DEFAULTS_INIT(uint_class, "System", "UIntPtr");
    DEFAULTS_INIT_TYPE(int_class, "System", "IntPtr", intptr_t);
    DEFAULTS_INIT_TYPE(int64_class, "System", "Int64", int64_t);
    DEFAULTS_INIT_TYPE(uint64_class, "System", "UInt64", uint64_t);
    DEFAULTS_INIT_TYPE(single_class, "System", "Single", float);
    DEFAULTS_INIT_TYPE(double_class, "System", "Double", double);
    DEFAULTS_INIT_TYPE(char_class, "System", "Char", Il2CppChar);
    DEFAULTS_INIT(string_class, "System", "String");
    // ...

il2cpp::vm::MetadataCache::Initialize (located in vm/MetadataCache.cpp, comments elided):

bool il2cpp::vm::MetadataCache::Initialize()
{
    s_GlobalMetadata = vm::MetadataLoader::LoadMetadataFile("global-metadata.dat");
    if (!s_GlobalMetadata)
        return false;

    s_GlobalMetadataHeader = (const Il2CppGlobalMetadataHeader*)s_GlobalMetadata;
    IL2CPP_ASSERT(s_GlobalMetadataHeader->sanity == 0xFAB11BAF);
    IL2CPP_ASSERT(s_GlobalMetadataHeader->version == 24);

    s_TypeInfoTable = (Il2CppClass**)IL2CPP_CALLOC(s_Il2CppMetadataRegistration->typesCount, sizeof(Il2CppClass*));
    s_TypeInfoDefinitionTable = (Il2CppClass**)IL2CPP_CALLOC(s_GlobalMetadataHeader->typeDefinitionsCount / sizeof(Il2CppTypeDefinition), sizeof(Il2CppClass*));
    s_MethodInfoDefinitionTable = (const MethodInfo**)IL2CPP_CALLOC(s_GlobalMetadataHeader->methodsCount / sizeof(Il2CppMethodDefinition), sizeof(MethodInfo*));
    s_GenericMethodTable = (const Il2CppGenericMethod**)IL2CPP_CALLOC(s_Il2CppMetadataRegistration->methodSpecsCount, sizeof(Il2CppGenericMethod*));
    s_ImagesCount = s_GlobalMetadataHeader->imagesCount / sizeof(Il2CppImageDefinition);
    s_ImagesTable = (Il2CppImage*)IL2CPP_CALLOC(s_ImagesCount, sizeof(Il2CppImage));
    s_AssembliesCount = s_GlobalMetadataHeader->assembliesCount / sizeof(Il2CppAssemblyDefinition);
    s_AssembliesTable = (Il2CppAssembly*)IL2CPP_CALLOC(s_AssembliesCount, sizeof(Il2CppAssembly));
    // ...

il2cpp::vm::MetadataLoader::LoadMetadataFile (located in vm/MetadataLoader.cpp):

void* il2cpp::vm::MetadataLoader::LoadMetadataFile(const char* fileName)
{
    std::string resourcesDirectory = utils::PathUtils::Combine(utils::Runtime::GetDataDir(), utils::StringView<char>("Metadata"));

    std::string resourceFilePath = utils::PathUtils::Combine(resourcesDirectory, utils::StringView<char>(fileName, strlen(fileName)));

    int error = 0;
    os::FileHandle* handle = os::File::Open(resourceFilePath, kFileModeOpen, kFileAccessRead, kFileShareRead, kFileOptionsNone, &error);
    if (error != 0)
    {
        utils::Logging::Write("ERROR: Could not open %s", resourceFilePath.c_str());
        return NULL;
    }

    void* fileBuffer = utils::MemoryMappedFile::Map(handle);

    os::File::Close(handle, &error);
    if (error != 0)
    {
        utils::MemoryMappedFile::Unmap(fileBuffer);
        fileBuffer = NULL;
        return NULL;
    }

    return fileBuffer;
}

All we usually have to do is bust out our decompiler and navigate through these functions. The latter two are of most interest: il2cpp::vm::MetadataLoader::LoadMetadataFile takes the filename of the metadata file and maps it into memory, while il2cpp::vm::MetadataCache::Initialize calls this function and stores the pointer to the mapped file in a static global variable, then begins reading data structures from it. Decryption and deobfuscation typically occur in one or both of these functions, so we will want to compare them closely with the original source code for changes.

Finding the metadata loader: Simplest case

If the metadata file is called global-metadata.dat and this string is not encrypted in the binary, we can cruise through on easy mode. Simply search for the filename string, search for cross-references to the string address, and the instruction you find will usually be in il2cpp::vm::MetadataCache::Initialize.

In IDA, you can do this as follows:

  1. Press Shift+F12 to generate a list of all the strings in the file
  2. Press Ctrl+F and type global-metadata.dat. There will likely be one match
  3. Double-click on the match
  4. Click on the label and press X to generate a list of cross-references
  5. Press Enter to follow the first and likely only cross-reference (or double-click on the desired reference if there is more than one)
  6. Press F5. You will now be in il2cpp::vm::MetadataCache::Initialize

Of course, we cannot rely on this string being available in plaintext if the developers have chosen to encrypt the strings to prevent this easy search – or at all if a different filename is used or if the metadata is embedded in the binary or another file – in which case, we will proceed to trace the code path.
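
Before opening a disassembler at all, you can check whether easy mode is even available. This is a minimal Python sketch under my own naming (find_all and find_metadata_string are hypothetical helpers). It reports raw file offsets, not virtual addresses, so it only tells you whether the plaintext string exists in the binary.

```python
def find_all(data, needle):
    """Return every offset at which needle occurs in data."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


def find_metadata_string(binary_path, name=b"global-metadata.dat"):
    """Return file offsets of the plaintext filename in the binary (empty if absent)."""
    with open(binary_path, "rb") as f:
        return find_all(f.read(), name)
```

An empty list means the string is encrypted, the file has been renamed, or the metadata lives somewhere else, and the code path has to be traced by hand.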

Finding the metadata loader: Tracing an unobfuscated code path

The next simplest case involves no obfuscation of the actual code path to the loader (although the loader itself may be obfuscated). After having navigated to il2cpp_init in IDA and invoked the decompiler, here is an example trace which is typical of most IL2CPP binaries (this example is from an unobfuscated version of Fall Guys):

__int64 __fastcall il2cpp_init(__int64 a1)
{
  __int64 v1; // rbx

  v1 = a1;
  setlocale(0, Locale);
  return (unsigned __int8)sub_18025B340(v1);
}

The call to sub_18025B340 is almost certainly il2cpp::vm::Runtime::Init – we rename it and click through:

__int64 __fastcall il2cpp::vm::Runtime::Init(__int64 a1)
{
  v1 = a1;
  v66 = &unk_182B7C088;
  sub_180225A60(&unk_182B7C088);
  v2 = dword_182B7C338++;
  if ( v2 > 0 )
  {
    v3 = 1;
    goto LABEL_125;
  }
  sub_180225480();
  sub_180225460();
  v4 = (__int64 *)operator new(0x10ui64);
  v64 = v4;
  if ( v4 )
    v5 = (void *)sub_18028BD70(v4, 0x80000i64);
  else
    v5 = 0i64;
  qword_182B7C298 = v5;
  v6 = (__int64 *)operator new(0x10ui64);
  v64 = v6;
  if ( v6 )
    v7 = (void *)sub_18028BD30(v6);
  else
    v7 = 0i64;
  qword_182B7C2A0 = v7;
  v8 = (__int64 *)operator new(0x10ui64);
  v64 = v8;
  if ( v8 )
    v9 = (void *)sub_18028BD30(v8);
  else
    v9 = 0i64;
  qword_182B7C2A8 = v9;
  qword_182B7C398 = (__int64)"4.0";
  sub_1802253C0();
  sub_180225310();
  sub_18028C380();
  if ( !sub_18025CEF0() )
  {
    --dword_182B7C338;
    v3 = 0;
    goto LABEL_125;
  }
  sub_180293880();
  sub_180227740((__int64)sub_180274290);
  v10 = (__int64 *)operator new(0x18ui64);
  v64 = v10;

This is just the start of the function and we have to wade through a bunch of junk, but we can use waypoints to help us. Notice the assignment of the value 4.0 to qword_182B7C398 – this is the .NET Framework version which was assigned in the source code above:

    s_FrameworkVersion = framework_version_for(runtime_version);

    os::Image::Initialize();
    os::Thread::Init();
    il2cpp::utils::RegisterRuntimeInitializeAndCleanup::ExecuteInitializations();

    if (!MetadataCache::Initialize())
        return false;

Notice this is followed up by three function calls, followed by the call to MetadataCache::Initialize in the if statement. This pattern is identical in both the source code and decompilation, so we can surmise that sub_18025CEF0 is probably the target function, rename it and click through again:

char il2cpp::vm::MetadataCache::Initialize()
{
  v0 = sub_180261550("global-metadata.dat");
  *&xmmword_182B7C2D8 = v0;
  if ( v0 )
  {
    *(&xmmword_182B7C2D8 + 1) = v0;
    qword_182B7B948 = j_j__calloc_base(*(qword_182B7C2C0 + 48), 8i64);
    qword_182B7B950 = j_j__calloc_base(*(*(&xmmword_182B7C2D8 + 1) + 164i64) / 0x5Cui64, 8i64);
    qword_182B7B958 = j_j__calloc_base(*(*(&xmmword_182B7C2D8 + 1) + 52i64) >> 5, 8i64);
    qword_182B7B968 = j_j__calloc_base(*(qword_182B7C2C0 + 64), 8i64);
    dword_182B7B970 = *(*(&xmmword_182B7C2D8 + 1) + 172i64) / 0x28ui64;
    qword_182B7B978 = j_j__calloc_base(dword_182B7B970, 80i64);
    dword_182B7B980 = *(*(&xmmword_182B7C2D8 + 1) + 180i64) / 0x44ui64;
    qword_182B7B988 = j_j__calloc_base(dword_182B7B980, 96i64);
    v1 = *(&xmmword_182B7C2D8 + 1);

Tip: I’ve disabled casts in the code snippet above for readability. You can do this in IDA by pressing \ (backslash) in the decompiler window.

Here we clearly see at the very start that sub_180261550 corresponds to il2cpp::vm::MetadataLoader::LoadMetadataFile, and furthermore the resulting pointer v0 is stored in xmmword_182B7C2D8 – this is the static global storing the pointer to the memory-mapped metadata file (this may also be a dword or a qword depending on the architecture of the file you’re reverse engineering). This last point is very important because all accesses to the metadata by the application will occur via this pointer, so if there is any just-in-time deobfuscation or decryption to be performed, we will be able to find it by searching for references to this pointer. (Just-in-time means that the deobfuscation is performed just before the data is used, rather than when the file is loaded; this has the advantage of not leaving deobfuscated data lying around in memory, at the expense of slightly reduced performance.) Generally, however, the applications I’ve encountered perform the decryption before any accesses, either at the start of the function above or in il2cpp::vm::MetadataLoader::LoadMetadataFile.

Finding the metadata loader: What if there is no il2cpp_init?

You may come across files with no il2cpp_init export. The Unity player must somehow call into the main application binary, so we can resort to looking in UnityPlayer.dll or libunity.so to find this entry point. The Unity players do not have source code available, but they are relatively easy to follow and we can use an unmodified copy of the player for reference. Additionally, you can easily create an empty Unity project and enable PDB generation, enabling you to see all of the function names and other symbols in the disassembly of the player.

When there is no il2cpp_init export, there are a few immediate possibilities:

  • The export name is obfuscated / encrypted
  • The player calls a different export to perform the initialization
  • The init function’s RVA (relative virtual address) is hardcoded in the player
  • The player calls an export which retrieves the function’s address from the application binary
  • The application binary calls an export on the player in its load hooks when the OS loads the file to provide the function address

There are two main points of interest in the player: where the reference to il2cpp_init is acquired, and where it is called. The normal flow of execution in an unobfuscated player looks like this:

UnityMainImpl is rather long with a great many function calls, but we can use various string literals to guide the way:

      winutils::DisplayErrorMessagesAndQuit("Data folder not found");
    }
    DetectIL2CPPVersion();
    v78.m_data = 0i64;
    v78.m_size = 0i64;
    v78.m_label.identifier = 68;
    v78.m_internal[0] = 0;
    core::StringStorageDefault<char>::assign(&v78, "GameAssembly.dll", 0x10ui64);
    v27 = !LoadIl2Cpp(&v78);
    if ( v78.m_data && v78.m_capacity > 0 )
      operator delete(v78.m_data, v78.m_label);
    if ( v27 )
      winutils::DisplayErrorMessagesAndQuit("Failed to load il2cpp");
    v78.m_data = 0i64;
    v78.m_size = 0i64;
    v78.m_label.identifier = 68;
    v78.m_internal[0] = 0;
    core::StringStorageDefault<char>::assign(&v78, "il2cpp_data", 0xBui64);

This decompilation is from a player with symbols via a PDB file, but we can easily search a non-annotated player binary for string literals such as Failed to load il2cpp (or the others shown above) and move around until we find LoadIl2Cpp.

LoadIl2Cpp itself is quite straightforward and contains dozens of these:

    v2 = 1;
    il2cpp_init = LookupSymbol(v1, "il2cpp_init", kSymbolRequired);
    if ( !il2cpp_init )
    {
      v2 = 0;
      printf_console("il2cpp: function il2cpp_init not found\n");
    }
    il2cpp_init_utf16 = LookupSymbol(gIl2CppModule, "il2cpp_init_utf16", kSymbolRequired);
    if ( !il2cpp_init_utf16 )
    {
      v2 = 0;
      printf_console("il2cpp: function il2cpp_init_utf16 not found\n");
    }
    il2cpp_shutdown = LookupSymbol(gIl2CppModule, "il2cpp_shutdown", kSymbolRequired);
    if ( !il2cpp_shutdown )
    {
      v2 = 0;
      printf_console("il2cpp: function il2cpp_shutdown not found\n");
    }

As you can see, each export is looked up in the loaded binary file and its address is stored in a series of static global function pointers (il2cpp_init, il2cpp_init_utf16 etc.). This is the acquisition phase, so if the export is not present, look for changes here to see if a different export is called, or some other code is executed. Once again, if the strings are not obfuscated, we can search on these to find LoadIl2Cpp more easily.

What if there is no import? InitializeIl2CppFromMain looks something like this:

char __fastcall InitializeIl2CppFromMain(const core::basic_string<char,core::StringStorageDefault<char> > *monoConfigPath, const core::basic_string<char,core::StringStorageDefault<char> > *dataPath, int argc, const char **argv)
{
  v4 = argv;
  v5 = argc;
  v6 = dataPath;
  v7 = monoConfigPath;
  RegisterAllInternalCalls();
  il2cpp_runtime_unhandled_exception_policy_set(IL2CPP_UNHANDLED_POLICY_LEGACY);
  il2cpp_set_commandline_arguments(v5, v4, 0i64);
  v8 = v7->m_data;
  if ( !v7->m_data )
    v8 = &v7->8;
  il2cpp_set_config_dir(v8);
  v9 = v6->m_data;
  if ( !v6->m_data )
    v9 = &v6->8;
  il2cpp_set_data_dir(v9);
  v10 = GetMonoDebuggerAgentOptions(&result, 0);
  v11 = v10->m_data;
  if ( !v10->m_data )
    v11 = &v10->8;
  il2cpp_debugger_set_agent_options(v11);
  if ( result.m_data && result.m_capacity )
    operator delete(result.m_data, result.m_label);
  il2cpp_init("IL2CPP Root Domain");
  il2cpp_set_config("unused_application_configuration");
  profiling::ScriptingProfiler::Initialize();
  return 1;
}

There is considerable variance between versions since the Unity developers add new APIs like we’re about to have a world shortage. What they all have in common is the call to il2cpp_init, and the strings IL2CPP Root Domain and unused_application_configuration are a giveaway to finding this function even if the code path to it is obfuscated.

If il2cpp_init wasn’t imported in LoadIl2Cpp, it’s quite likely there will be a change in the above code to execute the initialization call. If the call looks the same as above, it’s quite likely that the il2cpp_init function pointer has been set elsewhere in the player.

Once we’ve determined the RVA of il2cpp_init or its equivalent, we can once again backtrack to the main application binary and trace our way through the code path as described earlier.
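
Converting between the address shown in the decompiler and an RVA is a single subtraction. A minimal sketch follows, using the Fall Guys address from earlier purely as an illustration. The image base 0x180000000 is my assumption, and va_to_rva is a hypothetical helper name. Always read the real base from your disassembler.

```python
def va_to_rva(virtual_address, image_base):
    """Convert a virtual address to an RVA relative to the module's image base."""
    if virtual_address < image_base:
        raise ValueError("address lies below the image base")
    return virtual_address - image_base


# sub_18025B340 from the earlier trace; the image base is assumed, not confirmed.
rva = va_to_rva(0x18025B340, 0x180000000)
print(hex(rva))  # 0x25b340
```

The RVA stays valid wherever the OS loads the module, which is why a player with a hardcoded entry point would store this value rather than an absolute address.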

Real-world examples

Now that we have a rough understanding of how it all hangs together on paper, let’s have some fun and look at some real examples found in the wild! Note that I’m not going to explain the actual deobfuscation here – indeed, I haven’t deobfuscated all of them anyway; the idea is to give you a starting point and help you recognize what you’re looking for. I highlight a few simple techniques below that you can use to find the code of interest.

Example 1: No strings, alternative il2cpp_init export (League of Legends: Wild Rift)

We search for the string global-metadata.dat and other strings from the initialization code path in the binary but don’t find anything, so we look for il2cpp_init to trace the code path.

This application does not have an il2cpp_init export, so we look at the exports list:

A quick glance shows that the export names are encoded with ROT-5, so nq2huu_nsny is the export il2cpp_init. If we don’t spot this encoding, we could also find the function by examining LoadIl2Cpp in libunity.so to find the loaded symbols.
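
Decoding the whole export list by eye gets tedious, so here is a minimal Python sketch of the letter shift. The function names are mine, and treating uppercase letters the same way is an assumption (the names seen in this sample are all lowercase). Digits and underscores are left alone, which matches nq2huu_nsny keeping its 2.

```python
import string


def shift_letters(text, shift):
    """Caesar-shift ASCII letters by shift places; leave everything else unchanged."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def decode_export(name):
    """Undo the five-place shift applied to the export names."""
    return shift_letters(name, -5)


for encoded in ("nq2huu_nsny", "nq2huu_fqqth", "nq2huu_hqfxx_kwtr_sfrj"):
    print(encoded, "->", decode_export(encoded))
```

Running every export through decode_export and importing the results as names in your disassembler makes the later decompilations far easier to read.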

We trace the code path down to Runtime::Init as normal:

  v2 = sub_18FB4A0();
  v3 = nullsub_1(v2);
  v4 = sub_18F6C88(v3);
  qword_79B75D8 = (__int64)"4.0";
  v5 = sub_18F8B04(v4);
  sub_18D63F8(v5);
  sub_18F3F2C();
  sub_188A6EC();
  v7 = nullsub_3(v6);
  v8 = sub_18D52B4(v7);
  v9 = sub_18B15BC(v8);
  sub_18A9B54(v9);
  sub_18F7B70(nq2huu_fqqth_0);
  memset(&qword_79B72B0, 0, 0x310uLL);
  v10 = sub_18E49BC("mscorlib.dll");
  qword_79B72B0 = nq2huu_fxxjrgqd_ljy_nrflj_0(v10);
  qword_79B72B8 = nq2huu_hqfxx_kwtr_sfrj_0(qword_79B72B0, "System", "Object");
  qword_79B72C8 = nq2huu_hqfxx_kwtr_sfrj_0(qword_79B72B0, "System", "Void");
  qword_79B72D0 = nq2huu_hqfxx_kwtr_sfrj_0(qword_79B72B0, "System", "Boolean");

We click through each of the functions called after the 4.0 assignment until we find one that looks like MetadataCache::Initialize. This turns out to be sub_188A6EC:

void sub_188A6EC()
{
  sub_18F3B04((__int64)aV, aV, 0x14u, dword_76D6C74, dword_76D6C74, 0LL);
  qword_79B7160 = sub_18F5C34(aV);
  qword_79B7168 = qword_79B7160;
  qword_79B7170 = sub_18F4F60(*(int *)(qword_79B7150 + 64), 8LL);
  qword_79B7178 = sub_18F4F60(*(int *)(qword_79B7168 + 164) / 0x64uLL, 8LL);
  qword_79B7180 = sub_18F4F60(*(int *)(qword_79B7168 + 52) / 0x34uLL, 8LL);
  qword_79B7188 = sub_18F4F60(*(int *)(qword_79B7150 + 48), 8LL);
  dword_79B7190 = *(int *)(qword_79B7168 + 180) / 0x28uLL;
  v0 = &unk_79B7000;
  qword_79B7198 = sub_18F4F60(dword_79B7190, 72LL);
  dword_79B71A0 = *(int *)(qword_79B7168 + 188) / 0x44uLL;
  qword_79B71A8 = sub_18F4F60(dword_79B71A0, 96LL);

We can tell this is the correct function because of the various data structure accesses in the allocation calls, which match those in the IL2CPP source code for this function.

Now we can home in on the three key lines at the start of the function:

  sub_18F3B04((__int64)aV, aV, 0x14u, dword_76D6C74, dword_76D6C74, 0LL);
  qword_79B7160 = sub_18F5C34(aV);
  qword_79B7168 = qword_79B7160;

We can initially assume that line 2 is the call to MetadataLoader::LoadMetadataFile with the filename passed as aV. The unknown call in line 1 is not present in the original source file, the string literal global-metadata.dat has been replaced by aV, and the call in line 1 receives a pointer to this variable ((__int64)aV) as its first argument. We assume that line 1 decrypts the filename and line 2 passes it to the loader, storing the result in qword_79B7160 and qword_79B7168 – our s_GlobalMetadata static pointer variable.

Info: Wild Rift also encrypts the binary with XOR encryption and the strings in global-metadata.dat, as well as rearranging the order of some metadata struct fields. Check out the full deobfuscation walkthrough here: Reverse Engineering Adventures: League of Legends Wild Rift (IL2CPP)

Example 2: Decoy global-metadata.dat file (Tale of Immortal / 鬼谷八荒)

A quick glance at global-metadata.dat reveals a bunch of nonsensical ASCII data. Initially, we suspect some form of encryption.

We search for the string global-metadata.dat in the application binary but come up blank, so we once again trace the code path from il2cpp_init to MetadataCache::Initialize. In this sample I have already renamed some symbols:

char il2cpp::MetadataCache::Initialize()
{
  if ( qword_182A02CC0
    && (qword_182A02CC8 = (__int64 (__fastcall *)(_QWORD))qword_182A02CC0(107i64),
        v0 = (__int64 (__fastcall *)(_QWORD))qword_182A02CC0(108i64),
        qword_182A02CD0 = v0,
        qword_182A02CC8)
    && v0 )
  {
    strcpy((char *)&v60, "game.dat");
    l_metadataFilePath = (const char *)&v60;
  }
  else
  {
    l_metadataFilePath = "../../resources.resource.resdata";
  }
  v2 = il2cpp::vm::MetadataLoader::LoadMetadataFile(l_metadataFilePath);
  s_GlobalMetadata = v2;
  v3 = 0;
  if ( v2
    || (v2 = il2cpp::vm::MetadataLoader::LoadMetadataFile("../../resources.resource.resdata"),
        s_GlobalMetadata = v2,
        qword_182A02CD0 = 0i64,
        v2) )
  {
    s_GlobalMetadataHeader = (Il2CppGlobalMetadataHeader *)v2;
    s_TypeInfoTable = IL2CPP_CALLOC(*(int *)(qword_182A03588 + 0x30), 8i64);
    s_TypeInfoDefinitionTable = IL2CPP_CALLOC(s_GlobalMetadataHeader->typeDefinitionsCount / 0x5Cui64, 8i64);
    s_MethodInfoDefinitionTable = IL2CPP_CALLOC((unsigned __int64)s_GlobalMetadataHeader->methodsCount >> 5, 8i64);
    s_GenericMethodTable = IL2CPP_CALLOC(*(int *)(qword_182A03588 + 0x40), 8i64);
    s_ImagesCount = s_GlobalMetadataHeader->imagesCount / 0x28ui64;
    s_ImagesTable = IL2CPP_CALLOC(s_ImagesCount, 0x50i64);

The code that selects between game.dat and ../../resources.resource.resdata, along with the fallback load in the second if statement, has been inserted by the developer. Without bothering to figure out what all the calls at the top do, we use our intuition to assume that development builds of the game use game.dat as the metadata file, and production builds use resources.resource.resdata – the provided global-metadata.dat file is a sneaky lie!

When we check this file in the game’s data folder, we find it does not resemble a global-metadata.dat file; however, it is clearly being loaded, so we know there is additional encryption.

We take a look at MetadataLoader::LoadMetadataFile. Let’s first recap the important part of this function from the original source code:

    os::FileHandle* handle = os::File::Open(resourceFilePath, kFileModeOpen, kFileAccessRead, kFileShareRead, kFileOptionsNone, &error);
    if (error != 0)
    {
        utils::Logging::Write("ERROR: Could not open %s", resourceFilePath.c_str());
        return NULL;
    }

    void* fileBuffer = utils::MemoryMappedFile::Map(handle);

    os::File::Close(handle, &error);
    if (error != 0)
    {
        utils::MemoryMappedFile::Unmap(fileBuffer);
        fileBuffer = NULL;
        return NULL;
    }

Line 1 retrieves a handle to the file, lines 2-6 check that the file has been opened correctly, line 8 maps the file into memory, line 10 closes the file handle, and lines 11-16 undo the mapping if there has been an error.

Now let’s look at Tale of Immortal’s spin on this part of the same function, which again I have partially annotated:

    handle = il2cpp::os::File::Open(resourceFilePath, 3, 1, 1u, 0, &error);
    v28 = handle;
    if ( error )
    {
      v29 = (const char *)resourceFilePath;
      if ( v51 >= 0x10 )
        v29 = (const char *)resourceFilePath[0];
      sub_1800CA220("ERROR: Could not open %s", v29);
      l_pDecryptedMetadata = 0i64;
    }
    else
    {
      hMappedFile = il2cpp::utils::MemoryMappedFile::Map(handle);
      il2cpp::os::File::Close(v28, &error);
      if ( error )
      {
        sub_1800C9DA0((__int64)hMappedFile);
        l_pDecryptedMetadata = 0i64;
      }
      else
      {
        metadataLength = calculateDecryptedMetadataLength((unkStruct *)hMappedFile);
        pMetadata = (char *)j_allocBytes((unsigned __int64)metadataLength);
        probably_memcpy(pMetadata, hMappedFile, (size_t)metadataLength);
        do
          ++l_lengthOfFirstKey;                 // 21
        while ( metadataFirstDecryptionKey[l_lengthOfFirstKey] );
        metadataLength_1 = (int)metadataLength;
        numBytesToCopy = (int)metadataLength - (int)l_lengthOfFirstKey;
        pDestBytes = j_securityMemoryAllocator((int)numBytesToCopy);

The developers have added an else clause to the file mapping error check which performs decryption (these symbols are my interpretation; in a real-world session you will have to figure out their meanings for yourself – start by naming the known variables to match those in the source code and proceed to pick it apart from there).

Info: You can check out the source code for an Il2CppInspector plugin that handles the truly bizarre encryption of early versions of Tale of Immortal here.

Example 3: No global-metadata.dat file (Guardian Tales / 가디언-테일즈)

This one got dumped on my desk a few days ago and has no global-metadata.dat file at all! The file must be stored somewhere else or embedded in the binary – an increasingly common practice to defeat automated reverse engineering.

By now, we know the drill: standard operating procedure, trace the path to MetadataCache::Initialize:

  result = vm::MetadataLoader::LoadMetadataFile();
  s_GlobalMetadata = (__int64)result;
  if ( result )
  {
    qword_6AF08E8 = (__int64)result;
    qword_6AF08F0 = sub_1B1E63C(*(int *)(qword_6AF08D0 + 48), 8LL);
    qword_6AF08F8 = sub_1B1E63C(*(int *)(qword_6AF08E8 + 164) / 0x5CuLL, 8LL);
    qword_6AF0900 = sub_1B1E63C((unsigned __int64)*(int *)(qword_6AF08E8 + 52) >> 5, 8LL);
    qword_6AF0908 = sub_1B1E63C(*(int *)(qword_6AF08D0 + 64), 8LL);

This seems normal except that no filename is passed to MetadataLoader::LoadMetadataFile. We drill down to the function:

_DWORD *vm::MetadataLoader::LoadMetadataFile()
{
  sub_1B1EF70((__int64)unkStruct);
  folder_and_error = "Resources";
  v20 = 9LL;
  v0 = *(_QWORD *)(unkStruct[0] - 24);
  *(_QWORD *)v18 = unkStruct[0];
  *(_QWORD *)&v18[8] = v0;
  il2cpp::utils::PathUtils::Combine(v18, &folder_and_error, &resourcesDirectory);
  sub_1B5D78C(unkStruct);
  *(_OWORD *)v18 = xmmword_52725E6;
  v18[0] = 109;
  v1 = 1LL;
  *(_OWORD *)&v18[11] = *(__int128 *)((char *)&xmmword_52725E6 + 11);
  do
    v18[v1++] ^= 0xFEu;
  while ( v1 != 26 );
  unkStruct[0] = (__int64)v18;
  unkStruct[1] = 26LL;
  v2 = *((_QWORD *)resourcesDirectory - 3);
  folder_and_error = resourcesDirectory;
  v20 = v2;
  il2cpp::utils::PathUtils::Combine(&folder_and_error, unkStruct, &resourceFilePath);
  LODWORD(folder_and_error) = 0;
  fileHandle = os::File::Open(&resourceFilePath, 3LL, 1LL, 1LL, 0LL, &folder_and_error);
  fileHandle_1 = fileHandle;
  if ( (_DWORD)folder_and_error )
  {
    utils::Logging::Write("ERROR: Could not open %s");
LABEL_7:
    v7 = 0LL;
    goto LABEL_8;
  }
  // ...

Again I’ve annotated the symbols according to the original source code. Note that I named folder_and_error this way because it serves two purposes – a pointer to a path string and an error code. This can occur as a result of compiler optimizations or incorrect decompilation.

It does appear at first glance that a file is opened from storage (the second PathUtils::Combine call and the os::File::Open call that follows it). But which file?

We don’t need to understand all of this code’s precise functionality to work this out. We can deduce that unkStruct is probably some kind of struct since the decompiler indexes it like an array but the stored values don’t appear to be of the same type, so we rename it accordingly. A number of functions receive this as an argument, and we note that it is ultimately passed to the second PathUtils::Combine call, which means that the first entry ultimately points to a filename or pathname.

Lines 12-17 seem to perform some kind of trivial XOR decryption – a loop which XORs each byte in v18 with 0xFE – and we might deduce from this that the filename length is 26 characters due to the number of iterations of the while loop (line 16) combined with the fact the final result (v18) is stored as a pointer in the first entry of unkStruct (line 17).

Let’s rename v18 to filename and undefine the awkward xmmword_52725E6 so that IDA interprets it as a sequence of bytes instead, then decompile again:

  sub_1B1EF70((__int64)unkStruct);
  folder_and_error = "Resources";
  v20 = 9LL;
  v0 = *(_QWORD *)(unkStruct[0] - 24);
  *(_QWORD *)filename = unkStruct[0];
  *(_QWORD *)&filename[8] = v0;
  il2cpp::utils::PathUtils::Combine(filename, &folder_and_error, &resourcesDirectory);
  sub_1B5D78C(unkStruct);
  *(_OWORD *)filename = unk_52725E6;
  filename[0] = 0x6D;
  v1 = 1LL;
  *(_OWORD *)&filename[11] = unk_52725F1;
  do
    filename[v1++] ^= 0xFEu;
  while ( v1 != 26 );
  unkStruct[0] = (__int64)filename;
  unkStruct[1] = 26LL;

Lines 16-17 tell us that unkStruct is probably a two-element struct where the first element is a pointer to the filename and the second element is the filename length.

The rest of the code constructs the encrypted filename before running the decryption function on lines 11-15. Let’s reconstruct it:

  • Byte zero (the first character) is ASCII code 0x6D or m (line 10); the loop counter starts at 1 – not zero (line 11) – so this character is not encrypted
  • Bytes 1-10 are set in line 9 to whatever unk_52725E6 is – this is an _OWORD assignment so 16 bytes are copied, but byte 0 is overwritten as above and bytes 11-15 are overwritten on line 12 (see below). Lines 4-6 also write to these bytes, using whatever sub_1B1EF70 stored in unkStruct on line 1, but those values play no part in the filename because this _OWORD assignment overwrites them.
  • Bytes 11-25 are then set in line 12 (replacing bytes 11-15 from the assignment on line 9) to whatever unk_52725F1 is

This mess is a bit of a decompilation quirk – unk_52725E6 and unk_52725F1 are right next to each other in memory, so essentially all this code does is copy 26 bytes from unk_52725E6 into filename, overwrite the first character with m and then XOR all the rest with 0xFE:

.rodata:00000000052725C6 aFileloadexcept DCB "FileLoadException",0
.rodata:00000000052725D8 aModule         DCB "<Module>",0
.rodata:00000000052725E1 aNull_0         DCB "NULL",0
.rodata:00000000052725E6 unk_52725E6     DCB 0x93
.rodata:00000000052725E7                 DCB 0x8D
.rodata:00000000052725E8                 DCB 0x9D
.rodata:00000000052725E9                 DCB 0x91
.rodata:00000000052725EA                 DCB 0x8C
.rodata:00000000052725EB                 DCB 0x92
.rodata:00000000052725EC                 DCB 0x97
.rodata:00000000052725ED                 DCB 0x9C
.rodata:00000000052725EE                 DCB 0xD0
.rodata:00000000052725EF                 DCB 0x9A
.rodata:00000000052725F0                 DCB 0x92
.rodata:00000000052725F1 unk_52725F1     DCB 0x92
.rodata:00000000052725F2                 DCB 0xD3
.rodata:00000000052725F3                 DCB 0x8C
.rodata:00000000052725F4                 DCB 0x9B
.rodata:00000000052725F5                 DCB 0x8D
.rodata:00000000052725F6                 DCB 0x91
.rodata:00000000052725F7                 DCB 0x8B
.rodata:00000000052725F8                 DCB 0x8C
.rodata:00000000052725F9                 DCB 0x9D
.rodata:00000000052725FA                 DCB 0x9B
.rodata:00000000052725FB                 DCB 0x8D
.rodata:00000000052725FC                 DCB 0xD0
.rodata:00000000052725FD                 DCB 0x9A
.rodata:00000000052725FE                 DCB 0x9F
.rodata:00000000052725FF                 DCB 0x8A
.rodata:0000000005272600                 DCB    0
.rodata:0000000005272601 aErrorCouldNotO DCB "ERROR: Could not open %s",0
.rodata:000000000527261A aErrorCouldNotG DCB "ERROR: Could not get length %s",0
.rodata:0000000005272639 aErrorCouldNotA DCB "ERROR: Could not alloc memory size %ld",0

Notice how there are a bunch of unencrypted strings on either side….

What do we get when we decrypt this string?

mscorlib.dll-resources.dat
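
If you want to check the result outside the decompiler, the loop is easy to replay. The following is a minimal Python sketch, assuming the 26 bytes are copied by hand from the .rodata listing above; the function name decrypt_filename is my own and does not appear in the binary.

```python
# Encrypted bytes copied from the .rodata listing at unk_52725E6 (26 bytes).
ENCRYPTED = bytes([
    0x93, 0x8D, 0x9D, 0x91, 0x8C, 0x92, 0x97, 0x9C, 0xD0, 0x9A, 0x92,
    0x92, 0xD3, 0x8C, 0x9B, 0x8D, 0x91, 0x8B, 0x8C, 0x9D, 0x9B, 0x8D,
    0xD0, 0x9A, 0x9F, 0x8A,
])


def decrypt_filename(encrypted: bytes, first_char: int = 0x6D, key: int = 0xFE) -> str:
    """Mirror the decompiled loop: byte 0 is replaced, bytes 1..n-1 are XORed."""
    out = bytearray(encrypted)
    out[0] = first_char            # filename[0] = 0x6D
    for i in range(1, len(out)):   # v1 starts at 1 and stops at 26
        out[i] ^= key              # filename[v1++] ^= 0xFE
    return out.decode("ascii")


if __name__ == "__main__":
    print(decrypt_filename(ENCRYPTED))  # prints mscorlib.dll-resources.dat
```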

Well, this file is certainly present but it doesn’t resemble a global-metadata.dat file. Is it an encrypted file? Is it combined with another file? Is it all a big lie and the metadata is embedded? That is for you to discover – once again, the idea here is to show how to get a foot – or maybe just a toe – in the door!

Example 4: Going in dry (Genshin Impact)

I love miHoYo, they troll us so badly with their IL2CPP obfuscation, but I won’t lie to you, dear reader: unravelling this is the kind of nightmare fuel that keeps self-respecting hackers awake at night.

Fortunately, although the encryption itself is a pain, finding the code of interest is pretty straightforward. The string global-metadata.dat exists in the application binary and leads us to this function:

void sub_1857C29C0()
{
  v0 = sub_1857C2DC0(v15);
  v11 = "Metadata";
  v12 = 8i64;
  sub_18576E440(v17, v0, &v11);
  if ( v16 >= 0x10 )
  {
    v1 = v15[0];
    if ( v16 + 1 >= 0x1000 )
    {
      if ( (v15[0] & 0x1F) != 0 )
        invalid_parameter_noinfo_noreturn();
      v2 = *(v15[0] - 8);
      if ( v2 >= v15[0] )
        invalid_parameter_noinfo_noreturn();
      if ( v15[0] - v2 < 8 )
        invalid_parameter_noinfo_noreturn();
      if ( v15[0] - v2 > 0x27 )
        invalid_parameter_noinfo_noreturn();
      v1 = *(v15[0] - 8);
    }
    j_free_0(v1);
  }
  v16 = 15i64;
  v15[2] = 0i64;
  LOBYTE(v15[0]) = 0;
  v11 = "global-metadata.dat";
  v12 = 19i64;
  sub_18576E440(v13, v17, &v11);
  v19 = 0;
  v3 = sub_185793850(v13, 3i64, 1i64);
  v4 = v3;
  if ( !v19 )
  {
    v5 = sub_1857935F0(v3, &v19);
    if ( !v19 )
    {
      v6 = sub_1857C0E80(v4, 0i64, 0);
      xmmword_187AEF530(v6, v5);
    }
  }
  // ...

Comparing to the source code for MetadataLoader::LoadMetadataFile, we might guess that the calls to sub_18576E440 (lines 6 and 30) are utils::PathUtils::Combine – because there are two of them and they both take filenames in v11 as an argument – and if we click through this function we find string append calls, which lends credence to this theory. Clicking on sub_1857935F0 (line 36) reveals a function resembling os::File::Open, so we can now be pretty certain that we’ve found the correct function, albeit modified from the original. Notably, the filename argument to LoadMetadataFile has been removed and replaced with a hardcoded reference to global-metadata.dat in the function body itself. Plus, there is an additional call through a mystery function pointer on line 40 – after the call to sub_1857C0E80 or utils::MemoryMappedFile::Map on line 39 – which is not present in the original. This may be some kind of decryption function.

We find one cross-reference to this function pointer:

.data:0000000187AEF530 xmmword_187AEF530 xmmword ?             ; DATA XREF: il2cpp_init_security+3↑w

By looking at the original source code of the IL2CPP API – il2cpp-api.h – we can determine that il2cpp_init_security is not part of the standard IL2CPP API. The decompilation gives:

void __fastcall il2cpp_init_security(__int64 a1)
{
  *&xmmword_187AEF530 = *a1;
  qword_187AEF540 = *(a1 + 16);
}

The function takes a single argument – a pointer to 24 bytes of data – and copies those bytes to the address of the function pointer we just identified (16 bytes to xmmword_187AEF530 and the remaining 8 to qword_187AEF540).

We can examine UnityPlayer.dll to find the argument passed to il2cpp_init_security and thus identify the entry point of this mystery extra function call. We first find LoadIl2Cpp by performing a string search for il2cpp_init_security and searching for cross-references as we did for the standard il2cpp_init function previously:

    v183 = 0i64;
    v185 = 0i64;
    v186 = 68;
    LOBYTE(v184) = 0;
    sub_1805C99D0(&v183, "il2cpp_init_security", 20i64);
    v180 = (__int64 (__fastcall *)(_QWORD))sub_180AD0930(qword_181BFA680, &v183, 0i64);
    qword_181BFA848 = v180;
    if ( v183 && v184 )
    {
      sub_18078C090(v183, v186);
      v180 = qword_181BFA848;
    }

Assuming that sub_1805C99D0 loads the symbol il2cpp_init_security into the pointer v183 and sub_180AD0930 equates to LookupSymbol – as we saw earlier in our look at the PDB-annotated player – it’s reasonable to assume that qword_181BFA848 points to the entry point of il2cpp_init_security. We search for cross-references to this pointer to find the call site. It turns out to be in Il2CppInitializeFromMain:

  *(_QWORD *)&v7 = qword_181C02810;
  *((_QWORD *)&v7 + 1) = qword_181C02820;
  v8 = v7;
  v9 = qword_181C02830;
  qword_181BFA848(&v8);
  qword_181BFA850(sub_180ABD230);
  sub_180B46A70();
  sub_180B13060();
  qword_181BFA720(0i64);
  qword_181BFA888(a3, a4, 0i64);
  qword_181BFA870();
  qword_181BFA878();
  sub_180ABD2E0();
  il2cpp_init("IL2CPP Root Domain");

The call to il2cpp_init_security occurs on line 5, passing in values set on lines 1 and 2 – qword_181C02810 and qword_181C02820. Once again, we search for cross-references to determine what these values are set to, and find it in some hitherto unknown function, the rest of which doesn’t matter right now:

__int64 (__fastcall *sub_180E951A0())(int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, char)
{
  __int64 (__fastcall *result)(int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, char); // rax

  qword_181C02808 = (__int64)sub_1801A62E0;
  qword_181C02810 = (__int64)sub_1801A6830;
  qword_181C02818 = (__int64)sub_18012ED60;
  qword_181C02820 = (__int64)sub_18012F170;
  qword_181C02828 = (__int64)sub_18012EF30;
  qword_181C02830 = (__int64)sub_18012F390;
  qword_181C02838 = (__int64)sub_1801A4D70;

Finally we have identified the function called by MetadataLoader::LoadMetadataFile in the game binary as sub_1801A6830 in UnityPlayer.dll – the decryption code. To recap:

  1. The function above stores the address of sub_1801A6830 in qword_181C02810
  2. LoadIl2Cpp finds the entry point of il2cpp_init_security in the application binary and stores it in qword_181BFA848
  3. Il2CppInitializeFromMain calls il2cpp_init_security via the pointer set in step 2 using the fetched pointer from step 1 as the argument
  4. il2cpp_init_security stores the function pointer passed in step 3 to xmmword_187AEF530
  5. MetadataLoader::LoadMetadataFile calls the function pointer set in step 4 immediately after mapping global-metadata.dat into memory, essentially calling sub_1801A6830 in the Unity player
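
The indirection in the recap can be easier to follow as a toy model. The Python sketch below is purely illustrative: player_decrypt stands in for sub_1801A6830 and uses a made-up XOR rather than the real (obfuscated) routine, and player_table and security_hooks are names I have invented for the storage at qword_181C02810 onwards and xmmword_187AEF530 onwards respectively.

```python
# Hypothetical stand-in for sub_1801A6830 in the player. The real routine is
# heavily obfuscated; this toy XOR exists only so the call flow can be observed.
def player_decrypt(buffer: bytearray, length: int) -> None:
    for i in range(length):
        buffer[i] ^= 0x55


# Step 1: the player fills a table of function pointers. Three entries are
# later handed to the game binary; only the first is modelled here.
player_table = [player_decrypt, None, None]

# Storage inside the game binary that il2cpp_init_security writes to.
security_hooks = [None, None, None]


def il2cpp_init_security(table) -> None:
    # Step 4: copy the three pointers (24 bytes) into the game binary's storage.
    security_hooks[:] = table[:3]


def load_metadata_file(mapped: bytearray) -> bytearray:
    # Step 5: after mapping the file, call through the stored pointer,
    # passing the buffer and its length.
    hook = security_hooks[0]
    if hook is not None:
        hook(mapped, len(mapped))
    return mapped


# Steps 2-3: the player resolves the export and calls it with its own table.
il2cpp_init_security(player_table)
```

Calling load_metadata_file on a buffer now runs player_decrypt on it, even though the loader itself contains no reference to that function. That is exactly why a cross-reference search inside the game binary alone does not find the decryption code.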

The actual function at sub_1801A6830 is a monster of obfuscated assembly code, but once again the point is to find where decryption occurs so that we can begin the process of reverse engineering it.

Going back to the game binary for a moment, we step up the call stack from MetadataLoader::LoadMetadataFile to the function which calls it. We expect to find MetadataCache::Initialize, and indeed we do:

void __fastcall sub_185756110(__int64 a1)
{
  sub_1857C29C0();
  qword_187AEF000 = v1;
  if ( !v1 )
  {
    sub_18575C4F0("########is NULL########\n");
    v1 = qword_187AEF000;
  }
  qword_187AEF008 = v1;
  v2 = (v1 + *(v1 + 120));
  v62 = v2;
  v3 = 0;
  if ( *(v1 + 124) / 0x44ui64 )
  {
    v4 = 0i64;
    do
    {
      sub_1857B1610(&v2[17 * v4]);
      v4 = ++v3;
    }
    while ( v3 < *(qword_187AEF008 + 124) / 0x44ui64 );
  }
  qword_187AEEED8 = j_j__calloc_base(*(qword_187AEEFE0 + 48), 8ui64);
  qword_187AEEEE0 = j_j__calloc_base(*(qword_187AEF008 + 84) / 0x68ui64, 8ui64);
  qword_187AEEEE8 = j_j__calloc_base(*(qword_187AEF008 + 300) >> 6, 8ui64);
  qword_187AEEEF8 = j_j__calloc_base(*(qword_187AEEFE0 + 64), 8ui64);
  dword_187AEEF00 = *(qword_187AEF008 + 116) >> 5;
  v5 = j_j__calloc_base(dword_187AEEF00, 0x38ui64);
  qword_187AEEF08 = v5;
  v6 = qword_187AEF000;
  v7 = (qword_187AEF000 + *(qword_187AEF008 + 112));

The first call (line 3) calls the MetadataLoader::LoadMetadataFile function we examined earlier, but no pointer to the decrypted metadata is returned. The uninitialized value v1 is used instead. The decompiler has slipped up in this case, and if we look at the lines which set (line 10) and access qword_187AEF008, it’s a pretty safe bet that this is the true pointer to the decrypted metadata. However, there is a problem. A closer examination of the header offsets referenced (e.g. 84, 300 and 116) indicates that even after decryption, the header fields are not in their normal order – they have been rearranged as a form of obfuscation! Untangling this will require a more thorough dissection of the binary file’s code.
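
One quick way to test whether a decrypted blob still uses the standard header layout is to read the offset/length pairs and check that the tables follow one another. This is a rough Python sketch under some assumptions: every field after the version is a pair of little-endian 32-bit integers (offset, then byte length), and pair_count is something you supply yourself from the header struct of the IL2CPP version in question. A header with shuffled fields, like the one above, would typically fail the check.

```python
import struct

MAGIC = 0xFAB11BAF  # stored little-endian on disk as AF 1B B1 FA


def read_header_pairs(data: bytes, pair_count: int):
    """Return (version, [(offset, size), ...]) from the start of a metadata blob."""
    magic, version = struct.unpack_from("<Ii", data, 0)
    if magic != MAGIC:
        raise ValueError(f"unexpected magic 0x{magic:08X}")
    pairs = [struct.unpack_from("<ii", data, 8 + 8 * i) for i in range(pair_count)]
    return version, pairs


def is_sequential(pairs, header_size: int) -> bool:
    """Heuristic: each table should start at or after the end of the previous one."""
    cursor = header_size
    for offset, size in pairs:
        if offset < cursor or size < 0:
            return False
        cursor = offset + size
    return True
```

Here header_size would be 8 + 8 * pair_count. The check is only a heuristic, since the original layout is described as generally, not always, sequential.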

Info: An extensive treatise of this entire obfuscation scheme can be found in my mini-series on Honkai Impact (which uses a similar scheme to Genshin Impact but slightly simplified). You can read more about analyzing the control flow obfuscated decryption code in my VMProtect control flow analysis article where Honkai Impact is used as a case study.

Conclusion

I’ve only illustrated a small sampling of the wide variety of schemes currently being used to foil the acquisition of global-metadata.dat here, but as you can see the process of reverse engineering them all starts in more or less the same way:

  1. Check global-metadata.dat in a hex editor to see if it is present, or encrypted. If not present, check other files in the application folder that could be candidates.
  2. Find MetadataCache::Initialize and MetadataLoader::LoadMetadataFile in the application binary using the techniques above, either via a string cross-reference lookup for global-metadata.dat or by tracing the code from il2cpp_init down the call chain if the string is unavailable.
  3. If you need to trace the code but the il2cpp_init export is not present, examine UnityPlayer.dll or libunity.so to find the entry point.
  4. Compare MetadataCache::Initialize and MetadataLoader::LoadMetadataFile with the original source code to identify changes and additions made by the developers. These changes are likely to be where decryption and deobfuscation take place.
  5. If the decryption code can be called externally, write a small program to call the decryption function and save the resulting file (see the Sharpen your knives section at the bottom of this article on the blog for a complete walkthrough of how to do this).
  6. Otherwise, reverse engineer the discovered changed code thoroughly to determine how the obfuscation works and how to defeat it.
  7. Consider writing a plugin for Il2CppInspector so that the target application can be loaded as normal without having to edit the tooling’s source code directly.
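
Step 1 is easy to automate. As a minimal sketch, assuming the metadata has merely been renamed or moved rather than encrypted, the following Python walks an extracted application folder and reports every file that starts with the magic bytes; find_metadata_candidates is a name of my own choosing.

```python
from pathlib import Path

MAGIC_BYTES = bytes([0xAF, 0x1B, 0xB1, 0xFA])


def find_metadata_candidates(root):
    """Yield every regular file under root whose first four bytes are the magic."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            with path.open("rb") as f:
                if f.read(4) == MAGIC_BYTES:
                    yield path
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
```

For example, `list(find_metadata_candidates("extracted_apk"))` (the folder name is a placeholder) returns the candidate paths; an empty result suggests the metadata is encrypted, embedded, or stored elsewhere.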

I hope you found these walkthroughs interesting and helpful – now get out there and be a shark!

Categories: IL2CPP Tags:
  1. HH
    January 17, 2022 at 16:05

    Hi
    I tried to decompile an app using Il2CppInspector. I found “global-metadata.dat” but there is no IL2CPP binary file to attach in Il2CppInspector.
    How can I find the binary file?
    best regards

    • Simp
      August 11, 2022 at 14:19

      I haven’t used il2cppinspector but you don’t need to decompile the apk. The shared-object binary is stored in lib// and global-metadata.dat in /assets/bin/Data/Managed/Metadata/, just like mentioned in the post.

  2. Jacob
    June 11, 2021 at 02:12

    The following checks don’t seem to be present(or at least have an obvious presence) in any of the binaries I’ve examined. Any ideas?

    IL2CPP_ASSERT(s_GlobalMetadataHeader->sanity == 0xFAB11BAF);
    IL2CPP_ASSERT(s_GlobalMetadataHeader->version == 24);
  3. May 5, 2021 at 02:50

    My approach in GDTS is to hook the LoadMetadataFile function directly to get its return value, then cast the return value to the Il2CppGlobalMetadataHeader type to read exportedTypeDefinitionsOffset and exportedTypeDefinitionsCount. Adding these together gives the file size of global-metadata.dat, which you can then write out. After that, you need to fix the flag bit of the file so that you can get the decrypted file. Sorry for my shit-like English.

    • June 4, 2021 at 12:28

      Yep, hooking the return of LoadMetadataFile is going to be an almost universal way to get a pointer to the file in memory and I actually should have mentioned in the article that doing it this way via dynamic analysis is likely the simplest approach if there is no anti-tamper (eg. Themida) and you just want the file itself. When I was writing the article I was focused on actually learning how a particular obfuscation scheme works via static analysis reverse engineering, but for practical use, yes, you are absolutely right 🙂 Good catch!

    • Cheese
      September 13, 2021 at 03:34

      What tools did you use to do that? I tried using this frida script but it wouldn’t find the module.

      var awaitForCondition = function () {
           var int = setInterval(function () {
               
               var addr = Module.findBaseAddress('libil2cpp.so');
               console.log(addr)
               if (addr) {
                   console.log("Address found:", addr);
                   clearInterval(int);
                   return;
               }
           }, 0);
       }
      awaitForCondition()
  4. Puppet4473
    March 4, 2021 at 20:33

    Hey,

    first of all, thank you for your walkthrough, I appreciate it.

    Anyways, so I read this article for the third time now, and I understood most of it (I think :D), but I’m still struggling with the (probably?) most important part: actually decrypting the metadata.dat file.

    Correct me if I’m wrong, but the objective is to basically find the “il2cpp::vm::MetadataCache::Initialize” and “il2cpp::vm::MetadataLoader::LoadMetadataFile” function in the il2cpp binary of the game and to compare that with the source code of il2cpp (MetadataCache.cpp & MetadataLoader.cpp) in order to see if it has been changed by the developers. I got to this point, but I have no idea what to do next. How exactly does this help me to decrypt the metadata.dat file?

    This is a very noob-ish question, I know, but I’m actually stuck at this part.

    • Puppet4473
      March 4, 2021 at 20:42

      And is it possible to dump the memory of the game to obtain the decrypted version of the file? Isn’t that a viable option?

    • March 5, 2021 at 17:55

      Hey Puppet!

      Thanks for the kind words. You are 100% correct, that is the objective. The differences between the original IL2CPP source code and the code you find in the target app in those two specific functions are likely to be the obfuscation/encryption code, so the purpose of the article is merely to help you find the starting point basically, ie. the relevant code to retrieving/deobfuscating the metadata.

      I don’t talk in the article about how to specifically reverse engineer any particular type of obfuscation because they are all wildly different and mostly unique to each app/game, it’s entirely dependent on what the developers have concocted so there is no “one size fits all” approach to the specifics of a single obfuscation scheme.

      Generally if the code itself is not obfuscated, it’s sufficient to just sit in the decompiler and work your way through it to determine how it works, then you can replicate the behaviour yourself, or sometimes it’s possible to isolate the decryption to a single function call and you can load the binary standalone and call the function to invoke the decryption without needing to understand how it works (see the miHoYo plugin / Honkai Impact articles for an example of that).

      For example if you find an inserted piece of code by the developer that adds 1 to every byte in the metadata’s file buffer – which is obviously not present in the original IL2CPP code – you can say aha, they have ‘encrypted’ it like this, then write a program which does the same thing to decrypt the file. So you basically have to do a lot of detective work! Understanding how to read the decompilation is very difficult at first, but the more you do it, the faster you’ll become at it, so just take your time and try to find any low-hanging fruit like variables with obvious uses. As you gradually start renaming things, the overall purpose of the code will become more clear.
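
To make that hypothetical concrete, here is a minimal Python sketch of the "add 1 to every byte" scheme and its inverse. Both function names are invented for illustration and the scheme itself is only the made-up example from the paragraph above, not something taken from a real game.

```python
def encrypt_add_one(data: bytes) -> bytes:
    """What the hypothetical developer code does: add 1 to every byte (mod 256)."""
    return bytes((b + 1) & 0xFF for b in data)


def decrypt_sub_one(data: bytes) -> bytes:
    """The inverse you would write yourself: subtract 1 from every byte (mod 256)."""
    return bytes((b - 1) & 0xFF for b in data)
```

Running decrypt_sub_one over the whole file and saving the result would then give back the original metadata.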

      You mentioned dumping the memory of the running process – yes, this is possible sometimes. Some apps disallow it, but more usually the problem is that the metadata file is never decrypted all at once, in its entirety, in a single location in memory, with valid headers and everything else. There ARE cases where it is, so it’s always worth dumping the process and searching for the magic header bytes before you get knee deep into the code itself, but most of the time you’re gonna have to piece it together yourself.
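
Searching a dump for the magic bytes can be sketched in a few lines of Python. This assumes you already have the dump as a file or byte string; the version range used to filter out false positives is an arbitrary guess on my part, not a documented limit.

```python
import struct

MAGIC_BYTES = bytes([0xAF, 0x1B, 0xB1, 0xFA])


def find_magic_offsets(dump: bytes, min_version: int = 16, max_version: int = 40):
    """Return offsets where the magic is followed by a plausible version number."""
    hits = []
    pos = dump.find(MAGIC_BYTES)
    while pos != -1:
        if pos + 8 <= len(dump):
            (version,) = struct.unpack_from("<i", dump, pos + 4)
            if min_version <= version <= max_version:
                hits.append(pos)
        pos = dump.find(MAGIC_BYTES, pos + 1)
    return hits
```

Any offset it returns is only a candidate: the tables behind the header may still be missing, encrypted or rearranged.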

      Sorry this is rather vague, it’s hard to give specific advice for specific apps, but I hope it helps point you in the right direction!

      • Puppet4473
        March 9, 2021 at 20:49

        Thanks for the reply.

        So I went ahead and analyzed the MetadataCache::Initialize function of the game and I also created an empty Unity project with the same Unity version so that I can compare these.

        I found out that the “LoadMetadataFile” function is probably not responsible for the decryption process since it’s pretty much identical to the original – however, I found some differences between the MetadataCache::Initialize functions.

        I know that you made this thread so you don’t have to waste your time helping other people with their problems, but would you mind taking a look at my approach? I’m not expecting you to actually do it, but I would appreciate it.

        This .zip contains a text file of my analysis of the function with proper variable names and comments, the MetadataCache.cpp for the corresponding IL2CPP version, and the MetadataCache::Initialize function with and without symbols of an empty IL2CPP Unity project.

        https://mega.nz/file/ooFBhYpb#AxZjGpUtc0JAm6OHdPlbMlgPeFiyFyO2ozPSt76l3To

        I also tried to dump the memory, but the game force closed afterwards, so there’s that 😀

        • March 11, 2021 at 20:44

          Hey again!

          Your approach is exactly correct and you’ve done a very good job of the analysis. Bear in mind that compiler optimizations and the loss of information when going from C++ to machine code can cause the decompiler to produce slightly different code to the original without any suspicious meaning.

          Your next step would be to define v18. From the code:

          *(v18 + 72LL * v16 + 8) = v20;

          this likely means that v18 is an array of structs – each 72 bytes long – with v16 as the index, and 8 as the offset into the particular struct field. From your earlier symbols it seems like v18 is the image table as you might expect. If you import Il2CppImage (or whichever the correct struct is) and define v18 as an array of that type, you’ll be able to see much more clearly what happens to v20.
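
The address arithmetic can be checked with a small Python sketch using ctypes. The Entry layout below is hypothetical: it is only a 72-byte placeholder with a field at offset 8, not the real Il2CppImage definition, which you would import from the IL2CPP headers instead.

```python
import ctypes


class Entry(ctypes.LittleEndianStructure):
    # Hypothetical 72-byte layout; only the field at offset 8 matters here.
    _fields_ = [
        ("field_0", ctypes.c_uint64),    # offset 0
        ("field_8", ctypes.c_uint64),    # offset 8  <- the "+ 8" in the expression
        ("rest", ctypes.c_uint8 * 56),   # pads the struct out to 72 bytes
    ]


def element_field_address(base: int, index: int) -> int:
    """Address computed by *(v18 + 72 * v16 + 8) for base v18 and index v16."""
    return base + ctypes.sizeof(Entry) * index + Entry.field_8.offset
```

Once the decompiler knows the array type, the same expression is displayed as something like v18[v16].field_8, which is far easier to reason about.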

          I actually think it probably just copies the image name to that location and is probably not anything nefarious, but I only looked at the code for 2 minutes. Since there’s only the odd line changed here or there and no obvious large blocks of new code or function calls, my guess would be that the metadata file is already decrypted by the time it is returned from LoadMetadataFile.

          You can search for references to byte_2CD5000 to see where it is set and what else the setting function does. It may or may not be relevant.

          I don’t think there is anything missing at the bottom where you commented, it looks like it was just optimized away by the compiler. But I might be wrong; really the best approach to go from here is to import as many structs as you can as it will make the remaining code vastly easier to read. If you don’t find any substantial differences after that (and usually you can ignore all that assembly/image initialization code btw), then you’ll need to look elsewhere.

          Hope that helps!
