Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2023112600 build of GrapheneOS causes bootloop if Magisk is installed as system server crashes #213

Closed
pixincreate opened this issue Nov 27, 2023 · 10 comments
Assignees

Comments

@pixincreate
Copy link

pixincreate commented Nov 27, 2023

Posting this here to let you and everyone be aware of this.

Once if you install 2023112600 build of GrapheneOS, do not flash Magisk to the other slot or whatsoever.
The reason being, the changes done under the hood is an advance in terms of hardening memory and its allocation (I did not care much to dig in to understand what's going on at this point).

Once you install the latest version, just reboot the device (With avbroot, the procedure would a be different, but make sure you that you do not install it directly). You'll have to boot your device without root once and Disable Hardened memory allocator and enable native code debugging. Then proceed to install Magisk as usual.

Well, I was wrong. The entirety has been broken:

  • Networking module does not work meaning no Signal and no WiFi.
  • Storage does not work, I cannot access files through apps
@pixincreate pixincreate changed the title Latest GOS build causing bootloop when Magisk is installed 2023112600 build of GrapheneOS causes bootloop if Magisk is installed as system server crashes Nov 27, 2023
@chenxiaolong chenxiaolong self-assigned this Nov 28, 2023
@chenxiaolong
Copy link
Owner

Did you happen to have Zygisk enabled? I definitely wouldn't be surprised if the new GrapheneOS changes impacted that. Zygisk is quite invasive.

Once you install the latest version, just reboot the device (With avbroot, the procedure would a be different, but make sure you that you do not install it directly). You'll have to boot your device without root once and Disable Hardened memory allocator and enable native code debugging. Then proceed to install Magisk as usual.

Folks using avbroot can patch the OTA once using --rootless instead of --magisk <apk> to get an unrooted system.

@pixincreate
Copy link
Author

pixincreate commented Nov 28, 2023

I was never able to get Zygisk working. It always said 'Reboot to turn on Zygisk'. At least I wished root worked as expected. After disabling hardened memory, I was able to get phone working but at the costs of network module dying and no storage (trying to access these 2 would result in crash)

@chenxiaolong
Copy link
Owner

That's interesting and makes me even more curious about why it'd be breaking parts of the system. It's been ages since I last looked at Magisk's internals, but the changes it used to make were mostly additive. At first glance, it looks like nowadays, Magisk does make some changes that could conflict with the types of security improvements GrapheneOS makes (eg. it blocks everything besides Magisk itself from loading SELinux policies).

Does adb logcat | grep avc: print any denied messages when the network or storage components crash?

@pixincreate
Copy link
Author

pixincreate commented Nov 28, 2023

I was looking at GOS and came up with this: GrapheneOS/os-issue-tracker#2725

Does adb logcat | grep avc: print any denied messages when the network or storage components crash?

I was able to see a lot of avc: denied exceptions yesterday

@stavae
Copy link

stavae commented Dec 1, 2023

Patching boot.img manually in magisk (didnt know avbroot existed) has the same issue :/

@ccoager
Copy link

ccoager commented Dec 3, 2023

Perhaps this is fixed already, I haven't tried a nightly build though.
topjohnwu/Magisk#7489

@pixincreate
Copy link
Author

pixincreate commented Dec 3, 2023

The latest CI build did fix a lot issues except the Zygisk (it never turns on) of course:

12-03 23:56:51.158   899   899 E : zygisk64: unknown signature of nativeForkAndSpecialize(II[II[[IILjava/lang/String;Ljava/lang/String;[I[IZLjava/lang/String;Ljava/lang/String;Z[Ljava/lang/String;[Ljava/lang/String;ZZ[J)I
12-03 23:56:51.158   899   899 E : zygisk64: unknown signature of nativeSpecializeAppProcess(II[II[[IILjava/lang/String;Ljava/lang/String;ZLjava/lang/String;Ljava/lang/String;Z[Ljava/lang/String;[Ljava/lang/String;ZZ[J)V
12-03 23:56:51.158   899   899 I : zygisk64: replace nativeForkSystemServer
12-03 23:56:51.621   900   900 E : zygisk32: unknown signature of nativeForkAndSpecialize(II[II[[IILjava/lang/String;Ljava/lang/String;[I[IZLjava/lang/String;Ljava/lang/String;Z[Ljava/lang/String;[Ljava/lang/String;ZZ[J)I
12-03 23:56:51.621   900   900 E : zygisk32: unknown signature of nativeSpecializeAppProcess(II[II[[IILjava/lang/String;Ljava/lang/String;ZLjava/lang/String;Ljava/lang/String;Z[Ljava/lang/String;[Ljava/lang/String;ZZ[J)V

@pixincreate
Copy link
Author

pixincreate commented Dec 7, 2023

HuskyDG/magisk-files@f3f4eff

Seems to have fixed it in Magisk Delta. Even Zygisk works

@thestinger
Copy link

It broke because we added a parameter to pass SELinux-related flags in order to support per-app toggles for hardening features rather than those hardening features being exclusive to the base OS. We used this to replace our global ptrace toggle with a per-app toggle, which is kept compatible by default. Users can disable ptrace per-app or globally, then enable it for apps requiring it, and they'll know which apps require it because we generate a clear notification referencing the feature when apps try and fail to use it. We plan to add a toggle for dynamic native code execution too, which will block in-memory and on-storage dynamic native code execution with the same approach to notifying users and granting exceptions. None of this breaks CTS compatibility despite what's claimed in the locked issue. Apps all continue working by default because these are opt-in features. The CTS doesn't disallow us from adding parameters to internal functions used only inside the OS and not exposed to apps.

Magisk seems to have worked fine from when it was launched until September 2023 when we added this parameter. It's not our fault that these things break from making entirely internal changes to the OS. It's not officially supported by GrapheneOS and we aren't going to limit our hardening work due to avoiding changes which break really hacky patching of the OS which inherently loses a lot of security features we put a lot of work into. We didn't intentionally break any of this, we just aren't going to stop doing our hardening work because unsupported stuff has issues with it. None of this work broke apps.

There are apps with memory corruption bugs which break due to hardened_malloc but we have a per-app compatibility mode for that reason. These are 100% apps bugs and it does not break the CTS to harden against memory corruption by detecting memory corruption bugs with 0 false positives. CTS does not require supporting memory corruption working without being detected. Stock Pixel OS can detect most of the same issues as our hardening features by enabling the hardware memory tagging via developer options on the Pixel 8 and then enabling it for userspace via adb shell. It will likely eventually use memory tagging in async mode by default and app developers would save themselves future trouble by fixing their bugs sooner rather than later. Many app developers have fixed memory corruption bugs due to reports from GrapheneOS users, benefiting the whole ecosystem and clearing the path for Google to deploy better protections without nearly as much disruption to those app developers.

Not clear why people are being so hostile towards us in that issue thread. We clearly didn't add this feature to break things. One of the main reasons we warn people against patching the OS this way is because we expect things to break like this and can't really do anything about it. It also throws away a lot of security, regardless of whether users don't grant root access to any apps. It doesn't fit well with what we're trying to do. We haven't stopped people doing it, we just discourage it, and this should show why it was a good idea to discourage it.

Another issue is that patched images end up breaking delta updates because source images don't match target images. That leads to repeatedly downloading and trying to install updates with it failing in a way that it cannot fail for users who installed the OS in the official way.

@thestinger
Copy link

Our recommendation to people who want root access is making a userdebug build with ro.adb.secure=1. This is officially supported. It gives you adb root and adb shell. It does not provide a system for granting persistent app accessible root. We're aware people grant apps network ADB access, but network ADB on GrapheneOS has been hardened to prevent it being possible for it to persist after reboot. Network ADB requires regular ADB to enable so it can't be considered worse than regular ADB as long as network ADB persistence is blocked. ADB should definitely be far more limited on user builds but ADB is how they run the CTS and that's why it has so much access. A lot of what ADB can do is only because CTS uses it for testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants