Added z3 dts & dtb #7

amousa1990

Taken from shinano-castor; voltages checked against the downstream device tree.
@z3ntu
Member

z3ntu commented Nov 4, 2022

Move the dts to arch/arm/boot/dts/, add it to arch/arm/boot/dts/Makefile and delete the dtb from git
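
For illustration, the requested cleanup could look roughly like the sketch below. It assumes the commands run from the root of the kernel tree; `relocate_dts` is a hypothetical helper, and the file names passed to it are placeholders rather than the actual names used in this pull request. The Makefile entry is left as a manual step.

```shell
# Sketch only: relocate_dts and its arguments are hypothetical placeholders.
# Usage (from the kernel tree root): relocate_dts <board>.dts <board>.dtb
relocate_dts() {
    dts=$1
    dtb=$2
    # Move the source next to the other ARM device trees; git tracks the rename.
    git mv "$dts" arch/arm/boot/dts/ || return 1
    # Remove the compiled blob from git; it is a build product.
    git rm -q "$dtb" || return 1
    # Reminder for the remaining manual step.
    echo "now add $(basename "$dts" .dts).dtb to arch/arm/boot/dts/Makefile"
}
```

The matching `.dtb` name then needs to be listed in arch/arm/boot/dts/Makefile alongside the other boards for the same SoC so that the build produces it.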

@amousa1990
Author

Done. let me know if anything else is needed.

@z3ntu
Member

z3ntu commented Nov 6, 2022

Can you please send dmesg of booted device?

@amousa1990
Author

amousa1990 commented Nov 7, 2022

sony-leo:~$ cat dmesg
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 5.19.9-postmarketos-qcom-msm8974 ([email protected]) (armv7-alpine-linux-musleabihf-gcc (Alpine 12.2.1_git20220924-r4) 12.2.1 20220924, GNU ld (GNU Binutils) 2.39) #3 SMP PREEMPT Thu Nov 3 08:18:42 UTC 2022
[    0.000000] CPU: ARMv7 Processor [512f06f1] revision 1 (ARMv7), cr=10c5787d
[    0.000000] CPU: div instructions available: patching division code
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[    0.000000] OF: fdt: Machine model: Sony Xperia Z2 Tablet
[    0.000000] Memory policy: Data cache writealloc
[    0.000000] cma: Reserved 256 MiB at 0xd0000000
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x000000002fffffff]
[    0.000000]   HighMem  [mem 0x0000000030000000-0x00000000dfffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
[    0.000000]   node   0: [mem 0x0000000008000000-0x000000000fefffff]
[    0.000000]   node   0: [mem 0x000000000ff00000-0x000000005fffffff]
[    0.000000]   node   0: [mem 0x0000000080000000-0x00000000dfffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x00000000dfffffff]
[    0.000000] percpu: Embedded 16 pages/cpu s33940 r8192 d23404 u65536
[    0.000000] pcpu-alloc: s33940 r8192 d23404 u65536 alloc=16*4096
[    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 784704
[    0.000000] Kernel command line: PMOS_NO_OUTPUT_REDIRECT androidboot.emmc=true androidboot.bootloader=s1 oemandroidboot.s1boot=1286-7314_S1_Boot_MSM8974AC_LA3.0_M_3 androidboot.serialno=CB5A2AAAXB ta_info=1,16,256 startup=0x00004000 warmboot=0x00000000 oemandroidboot.imei=3583770657633800 oemandroidboot.phoneid=0000:3583770657633800,0000:3583770657633700 oemandroidboot.security=0 oemandroidboot.babe08b3=50000000 oemandroidboot.securityflags=0x00000003 lcdid_adc=0x90C04 display_status=off androidboot.baseband=msm
[    0.000000] Unknown kernel command line parameters "PMOS_NO_OUTPUT_REDIRECT ta_info=1,16,256 startup=0x00004000 warmboot=0x00000000 lcdid_adc=0x90C04 display_status=off", will be passed to user space.
[    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
[    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 2703356K/3145728K available (11264K kernel code, 1564K rwdata, 4560K rodata, 1024K init, 287K bss, 180228K reserved, 262144K cma-reserved, 2097152K highmem)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] trace event string verifier disabled
[    0.000000] rcu: Preemptible hierarchical RCU implementation.
[    0.000000]  Trampoline variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
[    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.000000] arch_timer: cp15 and mmio timer(s) running at 19.20MHz (virt/virt).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x46d987e47, max_idle_ns: 440795202767 ns
[    0.000002] sched_clock: 56 bits at 19MHz, resolution 52ns, wraps every 4398046511078ns
[    0.000017] Switching to timer-based delay loop, resolution 52ns
[    0.000225] Console: colour dummy device 80x30
[    0.000695] printk: console [tty0] enabled
[    0.000745] Calibrating delay loop (skipped), value calculated using timer frequency.. 38.40 BogoMIPS (lpj=192000)
[    0.000784] pid_max: default: 32768 minimum: 301
[    0.001011] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
[    0.001045] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
[    0.001945] CPU: Testing write buffer coherency: ok
[    0.002313] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.002356] qcom_scm: convention: smc legacy
[    0.003398] cblist_init_generic: Setting adjustable number of callback queues.
[    0.003428] cblist_init_generic: Setting shift to 2 and lim to 1.
[    0.003604] Setting up static identity map for 0x300000 - 0x300060
[    0.003797] rcu: Hierarchical SRCU implementation.
[    0.003818] rcu:     Max phase no-delay instances is 1000.
[    0.005100] smp: Bringing up secondary CPUs ...
[    0.006148] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.007408] CPU2: thread -1, cpu 2, socket 0, mpidr 80000002
[    0.008677] CPU3: thread -1, cpu 3, socket 0, mpidr 80000003
[    0.008884] smp: Brought up 1 node, 4 CPUs
[    0.008945] SMP: Total of 4 processors activated (153.60 BogoMIPS).
[    0.008970] CPU: All CPU(s) started in SVC mode.
[    0.009824] devtmpfs: initialized
[    0.022970] VFP support v0.3: implementor 51 architecture 64 part 6f variant 2 rev 1
[    0.023302] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.023348] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
[    0.036385] pinctrl core: initialized pinctrl subsystem
[    0.038092] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    0.040365] DMA: preallocated 256 KiB pool for atomic coherent allocations
[    0.041792] thermal_sys: Registered thermal governor 'step_wise'
[    0.043434] cpuidle: using governor menu
[    0.043790] hw-breakpoint: Failed to enable monitor mode on CPU 0.
[    0.089713] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
[    0.100411] iommu: Default domain type: Translated
[    0.100450] iommu: DMA domain TLB invalidation policy: strict mode
[    0.101289] SCSI subsystem initialized
[    0.101637] libata version 3.00 loaded.
[    0.102031] usbcore: registered new interface driver usbfs
[    0.102135] usbcore: registered new interface driver hub
[    0.102230] usbcore: registered new device driver usb
[    0.102458] mc: Linux media interface: v0.10
[    0.102541] videodev: Linux video capture interface: v2.00
[    0.103854] Advanced Linux Sound Architecture Driver Initialized.
[    0.105466] vgaarb: loaded
[    0.105912] clocksource: Switched to clocksource arch_sys_counter
[    0.121520] NET: Registered PF_INET protocol family
[    0.121868] IP idents hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.124615] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes, linear)
[    0.124684] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.124737] TCP established hash table entries: 8192 (order: 3, 32768 bytes, linear)
[    0.124832] TCP bind hash table entries: 8192 (order: 4, 65536 bytes, linear)
[    0.124979] TCP: Hash tables configured (established 8192 bind 8192)
[    0.125110] UDP hash table entries: 512 (order: 2, 16384 bytes, linear)
[    0.125177] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes, linear)
[    0.125430] NET: Registered PF_UNIX/PF_LOCAL protocol family
[    0.126198] RPC: Registered named UNIX socket transport module.
[    0.126235] RPC: Registered udp transport module.
[    0.126263] RPC: Registered tcp transport module.
[    0.126291] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.126328] PCI: CLS 0 bytes, default 64
[    0.127101] Trying to unpack rootfs image as initramfs...
[    0.136038] hw perfevents: enabled with armv7_krait PMU driver, 5 counters available
[    0.138379] Initialise system trusted keyrings
[    0.138737] workingset: timestamp_bits=14 max_order=20 bucket_order=6
[    0.149704] NFS: Registering the id_resolver key type
[    0.149828] Key type id_resolver registered
[    0.149861] Key type id_legacy registered
[    0.151578] Key type cifs.idmap registered
[    0.273842] Freeing initrd memory: 2012K
[    0.287513] Key type asymmetric registered
[    0.287549] Asymmetric key parser 'x509' registered
[    0.289933] alg: self-tests for CTR-KDF (hmac(sha256)) passed
[    0.290209] bounce: pool size: 64 pages
[    0.290399] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249)
[    0.290679] io scheduler mq-deadline registered
[    0.290713] io scheduler kyber registered
[    0.406561] msm_serial f991e000.serial: msm_serial: detected port #0
[    0.406651] msm_serial f991e000.serial: uartclk = 7372800
[    0.406822] f991e000.serial: ttyMSM0 at MMIO 0xf991e000 (irq = 37, base_baud = 460800) is a MSM
[    0.406897] msm_serial: console setup on port #0
[    1.147197] printk: console [ttyMSM0] enabled
[    1.153117] msm_serial f995d000.serial: msm_serial: detected port #1
[    1.156332] msm_serial f995d000.serial: uartclk = 19200000
[    1.162818] f995d000.serial: ttyMSM1 at MMIO 0xf995d000 (irq = 38, base_baud = 1200000) is a MSM
[    1.168302] serial serial0: tty port ttyMSM1 registered
[    1.177283] msm_serial: driver initialized
[    1.186279] platform fd922800.dsi: Fixing up cyclic dependency with fd900100.mdp
[    1.209964] brd: module loaded
[    1.222192] loop: module loaded
[    1.222995] SCSI Media Changer driver v0.25
[    1.225224] spmi spmi-0: PMIC arbiter version v1 (0x20000002)
[    1.230661] gpio gpiochip1: (fc4cf000.spmi:pm8841@4:mpps@a000): not an immutable chip, please consider fixing it!
[    1.244044] gpio gpiochip2: (fc4cf000.spmi:pm8941@0:gpios@c000): not an immutable chip, please consider fixing it!
[    1.246621] gpio gpiochip3: (fc4cf000.spmi:pm8941@0:mpps@a000): not an immutable chip, please consider fixing it!
[    1.263078] SLIP: version 0.8.4-NET3.019-NEWTTY (dynamic channels, max=256) (6 bit encapsulation enabled).
[    1.265148] CSLIP: code copyright 1989 Regents of the University of California.
[    1.276341] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    1.281907] ehci-pci: EHCI PCI platform driver
[    1.289379] usbcore: registered new interface driver cdc_acm
[    1.292909] cdc_acm: USB Abstract Control Model driver for USB modems and ISDN adapters
[    1.301372] UDC core: g_ether: couldn't find an available UDC
[    1.309319] rtc-pm8xxx fc4cf000.spmi:pm8941@0:rtc@6000: registered as rtc0
[    1.312414] rtc-pm8xxx fc4cf000.spmi:pm8941@0:rtc@6000: setting system clock to 1970-04-20T20:14:48 UTC (9490488)
[    1.319348] i2c_dev: i2c /dev entries driver
[    1.329946] i2c_qup f9964000.i2c:
[    1.329946]  tx channel not available
[    1.340445] spmi-temp-alarm fc4cf000.spmi:pm8841@4:temp-alarm@2400: failed to register sensor
[    1.341606] device-mapper: ioctl: 4.47.0-ioctl (2022-07-28) initialised: [email protected]
[    1.351309] sdhci: Secure Digital Host Controller Interface driver
[    1.357801] sdhci: Copyright(c) Pierre Ossman
[    1.363738] sdhci-pltfm: SDHCI platform and OF driver helper
[    1.370953] usbcore: registered new interface driver usbhid
[    1.373901] usbhid: USB HID core driver
[    1.380261] extcon-pm8941-misc fc4cf000.spmi:pm8941@0:misc@900: error -ENXIO: IRQ usb_vbus not found
[    1.383756] SPI driver bmp280 has no spi_device_id for bosch,bmp085
[    1.397628] Initializing XFRM netlink socket
[    1.399075] NET: Registered PF_INET6 protocol family
[    1.404763] Segment Routing with IPv6
[    1.407959] In-situ OAM (IOAM) with IPv6
[    1.411491] NET: Registered PF_PACKET protocol family
[    1.415418] NET: Registered PF_KEY protocol family
[    1.420432] 8021q: 802.1Q VLAN Support v1.8
[    1.425114] Key type dns_resolver registered
[    1.429462] Registering SWP/SWPB emulation handler
[    1.434041] Loading compiled-in X.509 certificates
[    1.513998] s4: Bringing 5100000uV into 5000000-5000000uV
[    1.530344] qcom-smbb fc4cf000.spmi:pm8941@0:charger@1000: Initializing SMBB rev 3
[    1.549387] ------------[ cut here ]------------
[    1.549436] WARNING: CPU: 0 PID: 8 at drivers/clk/qcom/clk-rcg2.c:133 update_config+0xe4/0xf0
[    1.550364] sdhci_msm f98a4900.sdhci: Got CD GPIO
[    1.553070] sdcc1_apps_clk_src: rcg didn't update its configuration.
[    1.553076] Modules linked in:
[    1.572611] CPU: 0 PID: 8 Comm: kworker/u8:0 Not tainted 5.19.9-postmarketos-qcom-msm8974 #3
[    1.575483] Hardware name: Generic DT based system
[    1.584072] Workqueue: events_unbound async_run_entry_fn
[    1.588682] [<c0311414>] (unwind_backtrace) from [<c030c090>] (show_stack+0x10/0x14)
[    1.594147] [<c030c090>] (show_stack) from [<c0d68ab4>] (dump_stack_lvl+0x40/0x4c)
[    1.601876] [<c0d68ab4>] (dump_stack_lvl) from [<c0322dd8>] (__warn+0xc8/0x16c)
[    1.609255] [<c0322dd8>] (__warn) from [<c0d63758>] (warn_slowpath_fmt+0x78/0xa4)
[    1.616460] [<c0d63758>] (warn_slowpath_fmt) from [<c08002b4>] (update_config+0xe4/0xf0)
[    1.624096] [<c08002b4>] (update_config) from [<c080075c>] (clk_rcg2_configure+0xa8/0xb0)
[    1.632257] [<c080075c>] (clk_rcg2_configure) from [<c07f4760>] (clk_change_rate+0x98/0x550)
[    1.640331] [<c07f4760>] (clk_change_rate) from [<c07f50f8>] (clk_core_set_rate_nolock+0x1d0/0x2b8)
[    1.648837] [<c07f50f8>] (clk_core_set_rate_nolock) from [<c07f5210>] (clk_set_rate+0x30/0x154)
[    1.657604] [<c07f5210>] (clk_set_rate) from [<c0ad062c>] (dev_pm_opp_set_rate+0x198/0x228)
[    1.666287] [<c0ad062c>] (dev_pm_opp_set_rate) from [<c0b03ce4>] (sdhci_msm_probe+0x2bc/0xa50)
[    1.674617] [<c0b03ce4>] (sdhci_msm_probe) from [<c09070b0>] (platform_probe+0x5c/0xb0)
[    1.683300] [<c09070b0>] (platform_probe) from [<c0904430>] (really_probe+0x174/0x40c)
[    1.691198] [<c0904430>] (really_probe) from [<c0904768>] (__driver_probe_device+0xa0/0x204)
[    1.699184] [<c0904768>] (__driver_probe_device) from [<c0904900>] (driver_probe_device+0x34/0xc4)
[    1.707778] [<c0904900>] (driver_probe_device) from [<c0904fa4>] (__device_attach_driver+0xac/0x128)
[    1.716549] [<c0904fa4>] (__device_attach_driver) from [<c090234c>] (bus_for_each_drv+0x80/0xcc)
[    1.725833] [<c090234c>] (bus_for_each_drv) from [<c0903d34>] (__device_attach_async_helper+0xac/0x100)
[    1.734603] [<c0903d34>] (__device_attach_async_helper) from [<c034b1ac>] (async_run_entry_fn+0x40/0x15c)
[    1.743721] [<c034b1ac>] (async_run_entry_fn) from [<c0340a30>] (process_one_work+0x1e8/0x510)
[    1.753440] [<c0340a30>] (process_one_work) from [<c0341154>] (worker_thread+0x48/0x504)
[    1.761944] [<c0341154>] (worker_thread) from [<c0347768>] (kthread+0xec/0x11c)
[    1.770190] [<c0347768>] (kthread) from [<c0300148>] (ret_from_fork+0x14/0x2c)
[    1.777218] Exception stack(0xf083dfb0 to 0xf083dff8)
[    1.784504] dfa0:                                     00000000 00000000 00000000 00000000
[    1.789639] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    1.797798] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    1.806094] ---[ end trace 0000000000000000 ]---
[    1.845952] mmc1: SDHCI controller on f9864900.sdhci [f9864900.sdhci] using ADMA
[    1.850882] s1: Bringing 0uV into 675000-675000uV
[    1.853108] s2: Bringing 0uV into 500000-500000uV
[    1.857544] s3: Bringing 0uV into 500000-500000uV
[    1.862163] s4: Bringing 0uV into 500000-500000uV
[    1.867460] s1: Bringing 0uV into 1300000-1300000uV
[    1.873085] s2: Bringing 0uV into 2150000-2150000uV
[    1.876836] s3: Bringing 0uV into 1800000-1800000uV
[    1.881768] s4: Bringing 0uV into 5000000-5000000uV
[    1.885953] l1: Bringing 0uV into 1225000-1225000uV
[    1.891361] l2: Bringing 0uV into 1200000-1200000uV
[    1.895730] l3: Bringing 0uV into 1200000-1200000uV
[    1.900571] l4: Bringing 0uV into 1225000-1225000uV
[    1.905447] l5: Bringing 0uV into 1800000-1800000uV
[    1.906558] ocmem fdd00000.ocmem: 8 ports, 3 regions, 2048 macros, interleaved
[    1.910267] l6: Bringing 0uV into 1800000-1800000uV
[    1.916483] msm_mdp fd900100.mdp: No interconnect support may cause display underflows!
[    1.923314] sdhci_msm f98a4900.sdhci: Got CD GPIO
[    1.925111] l7: Bringing 0uV into 1800000-1800000uV
[    1.925746] l8: Bringing 0uV into 1800000-1800000uV
[    1.926445] l9: Bringing 0uV into 1800000-1800000uV
[    1.927444] l11: Bringing 0uV into 1300000-1300000uV
[    1.937225] debugfs: Directory 'ci_hdrc.0' with parent 'ulpi' already present!
[    1.940195] l12: Bringing 0uV into 1800000-1800000uV
[    1.950118] sdhci_msm f98a4900.sdhci: Got CD GPIO
[    1.954892] l13: Bringing 0uV into 1800000-1800000uV
[    1.966974] debugfs: Directory 'ci_hdrc.0' with parent 'ulpi' already present!
[    1.976475] l14: Bringing 0uV into 1800000-1800000uV
[    1.989014] l15: Bringing 0uV into 2050000-2050000uV
[    1.989561] sdhci_msm f98a4900.sdhci: Got CD GPIO
[    1.990057] debugfs: Directory 'ci_hdrc.0' with parent 'ulpi' already present!
[    1.993478] dsi_regulator_init: failed to init regulator, ret=-517
[    1.993669] l16: Bringing 0uV into 2700000-2700000uV
[    1.994237] l17: Bringing 0uV into 2700000-2700000uV
[    1.994666] l18: Bringing 0uV into 2850000-2850000uV
[    1.995225] l19: Bringing 0uV into 2850000-2850000uV
[    1.995771] l20: Bringing 0uV into 2950000-2950000uV
[    1.998742] l21: Bringing 0uV into 2950000-2950000uV
[    2.003164] msm_dsi_host_init: regulator init failed
[    2.010701] l22: Bringing 0uV into 3000000-3000000uV
[    2.018506] input: gpio-keys as /devices/platform/gpio-keys/input/input0
[    2.021654] l23: Bringing 0uV into 2800000-2800000uV
[    2.027789] ALSA device list:
[    2.028462] debugfs: Directory 'ci_hdrc.0' with parent 'ulpi' already present!
[    2.030689] dsi_regulator_init: failed to init regulator, ret=-517
[    2.030700] msm_dsi_host_init: regulator init failed
[    2.031919] l24: Bringing 0uV into 3075000-3075000uV
[    2.032389] sdhci_msm f98a4900.sdhci: Got CD GPIO
[    2.036202]   No soundcards found.
[    2.044152] debugfs: Directory 'ci_hdrc.0' with parent 'ulpi' already present!
[    2.051171] mmc1: new ultra high speed SDR104 SDIO card at address 0001
[    2.086135] mmc0: SDHCI controller on f9824900.sdhci [f9824900.sdhci] using ADMA 64-bit
[    2.115839] using random self ethernet address
[    2.123218] using random host ethernet address
[    2.124033] usb0: HOST MAC 7a:e2:18:e8:a1:f4
[    2.126632] usb0: MAC f2:6f:d6:50:9f:41
[    2.131069] using random self ethernet address
[    2.134569] using random host ethernet address
[    2.139224] g_ether gadget.0: Ethernet Gadget, version: Memorial Day 2008
[    2.143576] g_ether gadget.0: g_ether ready
[    2.146390] mmc2: SDHCI controller on f98a4900.sdhci [f98a4900.sdhci] using ADMA
[    2.150846] ci_hdrc ci_hdrc.0: EHCI Host Controller
[    2.162472] ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 1
[    2.163123] Freeing unused kernel image (initmem) memory: 1024K
[    2.174391] l24: voltage operation not allowed
[    2.206108] ci_hdrc ci_hdrc.0: USB 2.0 started, EHCI 1.00
[    2.206400] Run /init as init process
[    2.206681] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.19
[    2.210494]   with arguments:
[    2.214138] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    2.214151]     /init
[    2.217128] mmc2: new ultra high speed SDR104 SDXC card at address 5048
[    2.218000] mmcblk2: mmc2:5048 SD128 116 GiB
[    2.220821]  mmcblk2: p1
[    2.222322] usb usb1: Product: EHCI Host Controller
[    2.222333] usb usb1: Manufacturer: Linux 5.19.9-postmarketos-qcom-msm8974 ehci_hcd
[    2.222352]     PMOS_NO_OUTPUT_REDIRECT
[    2.229523] usb usb1: SerialNumber: ci_hdrc.0
[    2.236029]   with environment:
[    2.241381] hub 1-0:1.0: USB hub found
[    2.243122]     HOME=/
[    2.247801] hub 1-0:1.0: 1 port detected
[    2.255360]     TERM=linux
[    2.263601]     ta_info=1,16,256
[    2.263613]     startup=0x00004000
[    2.263623]     warmboot=0x00000000
[    2.263631]     lcdid_adc=0x90C04
[    2.263640]     display_status=off
[    2.293951] mmc0: new HS200 MMC card at address 0001
[    2.294950] mmcblk0: mmc0:0001 016GE2 14.7 GiB
[    2.301643]  mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17 p18 p19 p20 p21 p22 p23 p24 p25
[    2.308497] mmcblk0boot0: mmc0:0001 016GE2 4.00 MiB
[    2.314702] mmcblk0boot1: mmc0:0001 016GE2 4.00 MiB
[    2.319454] mmcblk0rpmb: mmc0:0001 016GE2 4.00 MiB, chardev (244:0)
[    2.356098] lp855x 1-002c: device config err: -6
[    2.695955] usb 1-1: new high-speed USB device number 2 using ci_hdrc
[    2.896879] usb 1-1: New USB device found, idVendor=214b, idProduct=7250, bcdDevice= 1.00
[    2.896947] usb 1-1: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[    2.904093] usb 1-1: Product: USB2.0 HUB
[    2.912310] hub 1-1:1.0: USB hub found
[    2.915382] hub 1-1:1.0: 4 ports detected
[   13.686034] usb 1-1.4: new high-speed USB device number 3 using ci_hdrc
[   13.847996] usb 1-1.4: New USB device found, idVendor=1058, idProduct=1021, bcdDevice=20.21
[   13.848055] usb 1-1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[   13.855190] usb 1-1.4: Product: Ext HDD 1021
[   13.862512] usb 1-1.4: Manufacturer: Western Digital
[   13.866997] usb 1-1.4: SerialNumber: 57434156354C353931353235
[   15.307893] using random self ethernet address
[   15.307956] using random host ethernet address
[   15.330599] UDC core: g1: couldn't find an available UDC or it's busy
[   15.827005] EXT4-fs (dm-0): mounted filesystem without journal. Quota mode: disabled.
[   18.460341] EXT4-fs (dm-1): INFO: recovery required on readonly filesystem
[   18.460398] EXT4-fs (dm-1): write access will be enabled during recovery
[   18.638900] EXT4-fs (dm-1): recovery complete
[   18.640046] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Quota mode: disabled.
[   18.649233] EXT4-fs (dm-0): unmounting filesystem.
[   19.066979] EXT4-fs (dm-0): warning: mounting unchecked fs, running e2fsck is recommended
[   19.068405] EXT4-fs (dm-0): mounted filesystem without journal. Quota mode: disabled.
[   21.012985] udevd[918]: starting version 3.2.11
[   21.445953] random: crng init done
[   21.479464] udevd[918]: starting eudev-3.2.11
[   21.789135] Bluetooth: Core ver 2.22
[   21.789298] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[   21.789864] NET: Registered PF_BLUETOOTH protocol family
[   21.789877] Bluetooth: HCI device and connection manager initialized
[   21.790103] Bluetooth: HCI socket layer initialized
[   21.790125] Bluetooth: L2CAP socket layer initialized
[   21.792728] Bluetooth: SCO socket layer initialized
[   21.836960] Bluetooth: HCI UART driver ver 2.3
[   21.836980] Bluetooth: HCI UART protocol H4 registered
[   21.837391] Bluetooth: HCI UART protocol Broadcom registered
[   21.838413] hci_uart_bcm serial0-0: supply vbat not found, using dummy regulator
[   21.864900] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[   21.865754] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[   21.865776] cfg80211: failed to load regulatory.db
[   21.888652] hci_uart_bcm serial0-0: supply vddio not found, using dummy regulator
[   21.937161] input: pm8941_pwrkey as /devices/platform/soc/fc4cf000.spmi/spmi-0/0-00/fc4cf000.spmi:pm8941@0:pwrkey@800/input/input1
[   21.959229] usb-storage 1-1.4:1.0: USB Mass Storage device detected
[   21.960913] scsi host0: usb-storage 1-1.4:1.0
[   21.961385] usbcore: registered new interface driver usb-storage
[   22.151357] spmi-temp-alarm fc4cf000.spmi:pm8941@0:temp-alarm@2400: failed to register sensor
[   22.163612] usbcore: registered new interface driver uas
[   23.038706] scsi 0:0:0:0: Direct-Access     WD       Ext HDD 1021     2021 PQ: 0 ANSI: 4
[   23.039801] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   23.041235] sd 0:0:0:0: [sda] 1465143296 512-byte logical blocks: (750 GB/699 GiB)
[   23.043229] sd 0:0:0:0: [sda] Write Protect is off
[   23.043250] sd 0:0:0:0: [sda] Mode Sense: 17 00 10 08
[   23.045050] sd 0:0:0:0: [sda] No Caching mode page found
[   23.045067] sd 0:0:0:0: [sda] Assuming drive cache: write through
[   23.457377]  sda: sda1
[   23.457962] sd 0:0:0:0: [sda] Attached SCSI disk
[   23.821591] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac4339-sdio for chip BCM4339/2
[   24.063345] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac4339-sdio for chip BCM4339/2
[   24.064323] brcmfmac: brcmf_c_process_clm_blob: no clm_blob available (err=-2), device may have limited channels available
[   24.064810] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4339/2 wl0: Sep  5 2019 11:05:52 version 6.37.39.113 (r722271 CY)
[   24.155999] Bluetooth: hci1: command 0xfc18 tx timeout
[   27.996023] rmi4_i2c 0-002c: rmi_set_page: set page failed: -110.
[   27.996043] rmi4_i2c 0-002c: Failed to set page select to 0
[   27.997256] rmi4_i2c: probe of 0-002c failed with error -110
[   28.197673] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[   28.199748] Bridge firewalling registered
[   28.708782] EXT4-fs (dm-1): re-mounted. Quota mode: disabled.
[   32.476027] Bluetooth: hci1: BCM: failed to write update baudrate (-110)
[   32.476052] Bluetooth: hci1: Failed to set baudrate
[   32.476270] l7: disabling
[   33.065478] EXT4-fs (sda1): recovery complete
[   33.066645] EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: disabled.
[   34.565991] Bluetooth: hci1: command 0x0c03 tx timeout
[   35.379195] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[   42.716027] Bluetooth: hci1: BCM: Reset failed (-110)
[   42.716075] Bluetooth: hci1: hardware error 0x00
[   44.795966] Bluetooth: hci1: command 0xfc18 tx timeout
[   52.956019] Bluetooth: hci1: BCM: failed to write update baudrate (-110)
[   52.956041] Bluetooth: hci1: Failed to set baudrate
[   55.036007] Bluetooth: hci1: command 0x0c03 tx timeout
[   63.196034] Bluetooth: hci1: BCM: Reset failed (-110)
[ 6277.083267] cni0: port 1(vethf85f4c2b) entered blocking state
[ 6277.083289] cni0: port 1(vethf85f4c2b) entered disabled state
[ 6277.083587] device vethf85f4c2b entered promiscuous mode
[ 6277.157339] cni0: port 2(vethb166ce12) entered blocking state
[ 6277.157363] cni0: port 2(vethb166ce12) entered disabled state
[ 6277.157687] device vethb166ce12 entered promiscuous mode
[ 6277.157829] cni0: port 2(vethb166ce12) entered blocking state
[ 6277.157845] cni0: port 2(vethb166ce12) entered forwarding state
[ 6277.173147] cni0: port 2(vethb166ce12) entered disabled state
[ 6277.173278] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 6277.173482] IPv6: ADDRCONF(NETDEV_CHANGE): vethf85f4c2b: link becomes ready
[ 6277.173998] cni0: port 1(vethf85f4c2b) entered blocking state
[ 6277.174015] cni0: port 1(vethf85f4c2b) entered forwarding state
[ 6277.248698] IPv6: ADDRCONF(NETDEV_CHANGE): vethb166ce12: link becomes ready
[ 6277.249227] cni0: port 2(vethb166ce12) entered blocking state
[ 6277.249246] cni0: port 2(vethb166ce12) entered forwarding state
[ 6277.903499] cni0: port 3(veth08add738) entered blocking state
[ 6277.903522] cni0: port 3(veth08add738) entered disabled state
[ 6277.903969] device veth08add738 entered promiscuous mode
[ 6277.904095] cni0: port 3(veth08add738) entered blocking state
[ 6277.904111] cni0: port 3(veth08add738) entered forwarding state
[ 6278.002423] IPv6: ADDRCONF(NETDEV_CHANGE): veth08add738: link becomes ready
[ 6278.836824] cni0: port 4(veth32a48a8b) entered blocking state
[ 6278.836846] cni0: port 4(veth32a48a8b) entered disabled state
[ 6278.837153] device veth32a48a8b entered promiscuous mode
[ 6278.837286] cni0: port 4(veth32a48a8b) entered blocking state
[ 6278.837302] cni0: port 4(veth32a48a8b) entered forwarding state
[ 6278.889920] IPv6: ADDRCONF(NETDEV_CHANGE): veth32a48a8b: link becomes ready
[ 6279.748511] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P2954 } 3 jiffies s: 281 root: 0x0/T
[ 6279.748557] rcu: blocking rcu_node structures (internal RCU debug):
[ 6280.106800] cni0: port 5(veth8763990d) entered blocking state
[ 6280.106824] cni0: port 5(veth8763990d) entered disabled state
[ 6280.114293] device veth8763990d entered promiscuous mode
[ 6280.154632] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 6280.154826] IPv6: ADDRCONF(NETDEV_CHANGE): veth8763990d: link becomes ready
[ 6280.155396] cni0: port 5(veth8763990d) entered blocking state
[ 6280.155414] cni0: port 5(veth8763990d) entered forwarding state
[ 6283.225668] IPVS: Registered protocols (TCP, UDP)
[ 6283.225723] IPVS: Connection hash table configured (size=4096, memory=16Kbytes)
[ 6283.226509] IPVS: ipvs loaded.
[ 6283.292624] IPVS: [rr] scheduler registered.

@z3ntu
Member

z3ntu commented Nov 8, 2022

Please double check that you're using the correct dtb when testing, in your dmesg you're not using the dts you added here.

Also you are booting kernel 5.19.9 but you're adding this dts to 5.18 branch. This we can fix ourselves though, so don't worry about this.
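
One way to see which dtb a boot actually used is the `OF: fdt: Machine model:` line that the kernel prints early in dmesg; in the log above it reads "Sony Xperia Z2 Tablet". The helpers below are a minimal sketch of that check; the function names `machine_model` and `booted_with` are made up for this example.

```shell
# Hypothetical helper: print the machine model found in dmesg text on stdin.
machine_model() {
    sed -n 's/^.*OF: fdt: Machine model: //p' | head -n 1
}

# Hypothetical helper: succeed only if the booted model equals the expected one.
booted_with() {
    expected=$1
    [ "$(machine_model)" = "$expected" ]
}

# Example use on the device (the expected string is whatever the dts declares):
#   dmesg | booted_with "Sony Xperia Z2 Tablet" || echo "unexpected dtb"
```

On a running system the same string should also be readable from /proc/device-tree/model.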

@minlexx
Member

minlexx commented Nov 8, 2022

Done. let me know if anything else is needed.

The dts file needs to be moved to arch/arm/boot/dts: git mv ...

@amousa1990
Author

Please double check that you're using the correct dtb when testing, in your dmesg you're not using the dts you added here.

Also you are booting kernel 5.19.9 but you're adding this dts to 5.18 branch. This we can fix ourselves though, so don't worry about this.

It's the same dts, but in the copy I used on the phone I had forgotten to change the following lines:
model = "Sony Xperia Z2 Tablet";
compatible = "sony,xperia-castor", "qcom,msm8974";

to:
model = "Sony Xperia Z3";
compatible = "sony,xperia-z3", "qcom,msm8974";

everything else is the same

@amousa1990
Author

I have reflashed using the dtb that I had uploaded and here is my dmesg:
https://pastebin.com/z8dbivNu

@amousa1990
Author

Done. let me know if anything else is needed.

The dts file needs to be moved to arch/arm/boot/dts: git mv ...

I have added the dts to arch/arm/boot/dts

@minlexx
Member

minlexx commented Nov 20, 2022

I see that the .dts file was not moved; it's still at the root directory level.

That is not a big deal, but the missing S-o-B (Signed-off-by) line is.

@minlexx
Member

minlexx commented Nov 20, 2022

The commit should have a Signed-off-by: Your Name <your@email> line at the end.

Without authorship information we can't send code to linux upstream.
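
As a rough sketch, a commit message can be checked for that trailer with a small helper like the one below. `has_signoff` is a hypothetical name, and the pattern is only a loose approximation of a `Name <email>` line, not an official validation rule.

```shell
# Hypothetical helper: succeed if the commit message on stdin contains a
# "Signed-off-by: Name <email>" line.
has_signoff() {
    grep -Eq '^Signed-off-by: .+ <[^<>@ ]+@[^<>@ ]+>$'
}

# Example use: check the latest commit and add the trailer if it is missing.
#   git log -1 --format=%B | has_signoff || git commit --amend --signoff --no-edit
```

`git commit --signoff` (short form `-s`) builds the trailer from the configured `user.name` and `user.email`, so those should be set to the real author details first.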

z3ntu pushed a commit that referenced this pull request Dec 18, 2022
[ Upstream commit 4cc47e8 ]

We shouldn't be calling runtime PM APIs from within the genpd
enable/disable path for a couple reasons.

First, this causes an AA lockdep splat[1] because genpd can call into
genpd code again while holding the genpd lock.

WARNING: possible recursive locking detected
5.19.0-rc2-lockdep+ #7 Not tainted
--------------------------------------------
kworker/2:1/49 is trying to acquire lock:
ffffffeea0370788 (&genpd->mlock){+.+.}-{3:3}, at: genpd_lock_mtx+0x24/0x30

but task is already holding lock:
ffffffeea03710a8 (&genpd->mlock){+.+.}-{3:3}, at: genpd_lock_mtx+0x24/0x30

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&genpd->mlock);
  lock(&genpd->mlock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by kworker/2:1/49:
 #0: 74ffff80811a5748 ((wq_completion)pm){+.+.}-{0:0}, at: process_one_work+0x320/0x5fc
 #1: ffffffc008537cf8 ((work_completion)(&genpd->power_off_work)){+.+.}-{0:0}, at: process_one_work+0x354/0x5fc
 #2: ffffffeea03710a8 (&genpd->mlock){+.+.}-{3:3}, at: genpd_lock_mtx+0x24/0x30

stack backtrace:
CPU: 2 PID: 49 Comm: kworker/2:1 Not tainted 5.19.0-rc2-lockdep+ #7
Hardware name: Google Lazor (rev3 - 8) with KB Backlight (DT)
Workqueue: pm genpd_power_off_work_fn
Call trace:
 dump_backtrace+0x1a0/0x200
 show_stack+0x24/0x30
 dump_stack_lvl+0x7c/0xa0
 dump_stack+0x18/0x44
 __lock_acquire+0xb38/0x3634
 lock_acquire+0x180/0x2d4
 __mutex_lock_common+0x118/0xe30
 mutex_lock_nested+0x70/0x7c
 genpd_lock_mtx+0x24/0x30
 genpd_runtime_suspend+0x2f0/0x414
 __rpm_callback+0xdc/0x1b8
 rpm_callback+0x4c/0xcc
 rpm_suspend+0x21c/0x5f0
 rpm_idle+0x17c/0x1e0
 __pm_runtime_idle+0x78/0xcc
 gdsc_disable+0x24c/0x26c
 _genpd_power_off+0xd4/0x1c4
 genpd_power_off+0x2d8/0x41c
 genpd_power_off_work_fn+0x60/0x94
 process_one_work+0x398/0x5fc
 worker_thread+0x42c/0x6c4
 kthread+0x194/0x1b4
 ret_from_fork+0x10/0x20

Second, this confuses runtime PM on CoachZ for the camera devices by
causing the camera clock controller's runtime PM usage_count to go
negative after resuming from suspend. This is because runtime PM is
being used on the clock controller while runtime PM is disabled for the
device.

The reason for the negative count is because a GDSC is represented as a
genpd and each genpd that is attached to a device is resumed during the
noirq phase of system wide suspend/resume (see the noirq suspend ops
assignment in pm_genpd_init() for more details). The camera GDSCs are
attached to camera devices with the 'power-domains' property in DT.
Every device has runtime PM disabled in the late system suspend phase
via __device_suspend_late(). Runtime PM is not usable until runtime PM
is enabled in device_resume_early(). The noirq phases run after the
'late' and before the 'early' phase of suspend/resume. When the genpds
are resumed in genpd_resume_noirq(), we call down into gdsc_enable()
that calls pm_runtime_resume_and_get() and that returns -EACCES to
indicate failure to resume because runtime PM is disabled for all
devices.

Upon closer inspection, calling runtime PM APIs like this in the GDSC
driver doesn't make sense. It was intended to make sure the GDSC for the
clock controller providing other GDSCs was enabled, specifically the
MMCX GDSC for the display clk controller on SM8250 (sm8250-dispcc), so
that GDSC register accesses succeeded. That will already happen because
we make the 'dev->pm_domain' a parent domain of each GDSC we register in
gdsc_register() via pm_genpd_add_subdomain(). When any of these GDSCs
are accessed, we'll enable the parent domain (in this specific case
MMCX).

We also remove any getting of runtime PM during registration, because
when a genpd is registered it increments the count on the parent if the
genpd itself is already enabled.

Cc: Dmitry Baryshkov <[email protected]>
Cc: Johan Hovold <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Taniya Das <[email protected]>
Cc: Satya Priya <[email protected]>
Reviewed-by: Douglas Anderson <[email protected]>
Tested-by: Douglas Anderson <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Reported-by: Stephen Boyd <[email protected]>
Link: https://lore.kernel.org/r/CAE-0n52xbZeJ66RaKwggeRB57fUAwjvxGxfFMKOKJMKVyFTe+w@mail.gmail.com [1]
Fixes: 1b77183 ("clk: qcom: gdsc: enable optional power domain support")
Signed-off-by: Stephen Boyd <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Johan Hovold <[email protected]>
Reviewed-by: Johan Hovold <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
z3ntu pushed a commit that referenced this pull request Dec 18, 2022
commit 89d21e2 upstream.

test_bpf tail call tests end up as:

  test_bpf: #0 Tail call leaf jited:1 85 PASS
  test_bpf: #1 Tail call 2 jited:1 111 PASS
  test_bpf: #2 Tail call 3 jited:1 145 PASS
  test_bpf: #3 Tail call 4 jited:1 170 PASS
  test_bpf: #4 Tail call load/store leaf jited:1 190 PASS
  test_bpf: #5 Tail call load/store jited:1
  BUG: Unable to handle kernel data access on write at 0xf1b4e000
  Faulting instruction address: 0xbe86b710
  Oops: Kernel access of bad area, sig: 11 [#1]
  BE PAGE_SIZE=4K MMU=Hash PowerMac
  Modules linked in: test_bpf(+)
  CPU: 0 PID: 97 Comm: insmod Not tainted 6.1.0-rc4+ #195
  Hardware name: PowerMac3,1 750CL 0x87210 PowerMac
  NIP:  be86b710 LR: be857e88 CTR: be86b704
  REGS: f1b4df20 TRAP: 0300   Not tainted  (6.1.0-rc4+)
  MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 28008242  XER: 00000000
  DAR: f1b4e000 DSISR: 42000000
  GPR00: 00000001 f1b4dfe0 c11d2280 00000000 00000000 00000000 00000002 00000000
  GPR08: f1b4e000 be86b704 f1b4e000 00000000 00000000 100d816a f2440000 fe73baa8
  GPR16: f2458000 00000000 c1941ae4 f1fe2248 00000045 c0de0000 f2458030 00000000
  GPR24: 000003e8 0000000f f2458000 f1b4dc90 3e584b46 00000000 f24466a0 c1941a00
  NIP [be86b710] 0xbe86b710
  LR [be857e88] __run_one+0xec/0x264 [test_bpf]
  Call Trace:
  [f1b4dfe0] [00000002] 0x2 (unreliable)
  Instruction dump:
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  ---[ end trace 0000000000000000 ]---

This is an attempt to write above the stack. The problem is encountered
with tests added by commit 38608ee ("bpf, tests: Add load store
test case for tail call")

This happens because tail call is done to a BPF prog with a different
stack_depth. At the time being, the stack is kept as is when the caller
tail calls its callee. But at exit, the callee restores the stack based
on its own properties. Therefore here, at each run, r1 is erroneously
increased by 32 - 16 = 16 bytes.
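
The arithmetic above can be illustrated with a toy model. This is only a sketch: the function names are hypothetical, r1 is treated as a plain integer with an arbitrary starting value, and the assignment of the two frame sizes (16 bytes for the caller, 32 for the callee) is an assumption chosen so that the drift matches the 32 - 16 = 16 bytes quoted above.

```python
def run_once(r1, caller_frame, callee_frame):
    """One program run with the pre-fix tail call behaviour (toy model)."""
    r1 -= caller_frame   # caller prologue reserves its own frame
    # tail call: the stack is kept as is, nothing is released or resized
    r1 += callee_frame   # callee epilogue releases a frame of *its* size
    return r1


def drift_after(runs, caller_frame=16, callee_frame=32, r1=0x1000):
    """Return how far r1 has moved from its starting value after `runs` runs."""
    start = r1
    for _ in range(runs):
        r1 = run_once(r1, caller_frame, callee_frame)
    return r1 - start


if __name__ == "__main__":
    print(drift_after(1))          # drift after a single run
    print(drift_after(4))          # drift accumulates run after run
    print(drift_after(3, 32, 32))  # equal frame sizes: no drift
```

In this model the drift accumulates on every run, which is consistent with the test eventually writing above the stack.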

This was done that way in order to pass the tail call count from caller
to callee through the stack. As powerpc32 doesn't have a red zone in
the stack, it was necessary to maintain the stack as is for the tail
call. But it was not anticipated that the BPF frame size could be
different.

Let's take a new approach. Use register r4 to carry the tail call count
during the tail call, and save it into the stack at function entry if
required. This means the input parameter must be in r3, which is more
correct as it is a 32-bit parameter; tail calls then better match
normal BPF function entry, the downside being that we move that input
parameter back and forth between r3 and r4. That can be optimised later.

Doing that also has the advantage of maximising the common parts between
tail calls and a normal function exit.

With the fix, tail call tests are now successful:

  test_bpf: #0 Tail call leaf jited:1 53 PASS
  test_bpf: #1 Tail call 2 jited:1 115 PASS
  test_bpf: #2 Tail call 3 jited:1 154 PASS
  test_bpf: #3 Tail call 4 jited:1 165 PASS
  test_bpf: #4 Tail call load/store leaf jited:1 101 PASS
  test_bpf: #5 Tail call load/store jited:1 141 PASS
  test_bpf: #6 Tail call error path, max count reached jited:1 994 PASS
  test_bpf: #7 Tail call count preserved across function calls jited:1 140975 PASS
  test_bpf: #8 Tail call error path, NULL target jited:1 110 PASS
  test_bpf: #9 Tail call error path, index out of range jited:1 69 PASS
  test_bpf: test_tail_calls: Summary: 10 PASSED, 0 FAILED, [10/10 JIT'ed]

Suggested-by: Naveen N. Rao <[email protected]>
Fixes: 51c66ad ("powerpc/bpf: Implement extended BPF on PPC32")
Cc: [email protected]
Signed-off-by: Christophe Leroy <[email protected]>
Tested-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/757acccb7fbfc78efa42dcf3c974b46678198905.1669278887.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <[email protected]>
@amousa1990
Author

I'm completely lost on how this is done; I have searched but found nothing.
Do you have any documentation I could follow?

@wonderfulShrineMaidenOfParadise
Contributor

@z3ntu
Member

z3ntu commented Dec 27, 2022

In the meantime (and before this one was opened) I've also made a dts with most working components and pushed this to the 6.0 branch: 8f0171e

I still need to check and test some parts, and then I'll send it upstream. The lk2nd code is also working and merged.

@z3ntu z3ntu closed this Dec 27, 2022
@amousa1990
Author

That's great. What do you have working at the moment? Have you gotten MHL to work?

@z3ntu
Member

z3ntu commented Dec 28, 2022

Currently not much in that branch, but display, touch, wifi and bluetooth shouldn't be too hard to get working later. MHL not yet for sure; I think somebody tried for a while to get it working on the Nexus 5 but couldn't, so I don't think it's an easy component to support.

z3ntu pushed a commit that referenced this pull request Jun 15, 2023
Balance as exclusive state is compatible with paused balance and device
add, which makes some things more complicated. The assertion of valid
states when starting from paused balance needs to take into account two
more states, the combinations can be hit when there are several threads
racing to start balance and device add. This won't typically happen when
the commands are started from command line.

Scenario 1: With exclusive_operation state == BTRFS_EXCLOP_NONE.

Concurrently adding multiple devices to the same mount point: if
btrfs_exclop_finish finishes before the assertion in
btrfs_exclop_balance, exclusive_operation will have changed to the
BTRFS_EXCLOP_NONE state, which leads to the assertion failure:

  fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE ||
  fs_info->exclusive_operation == BTRFS_EXCLOP_DEV_ADD,
  in fs/btrfs/ioctl.c:456
  Call Trace:
   <TASK>
   btrfs_exclop_balance+0x13c/0x310
   ? memdup_user+0xab/0xc0
   ? PTR_ERR+0x17/0x20
   btrfs_ioctl_add_dev+0x2ee/0x320
   btrfs_ioctl+0x9d5/0x10d0
   ? btrfs_ioctl_encoded_write+0xb80/0xb80
   __x64_sys_ioctl+0x197/0x210
   do_syscall_64+0x3c/0xb0
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

Scenario 2: With exclusive_operation state == BTRFS_EXCLOP_BALANCE_PAUSED.

Concurrently adding multiple devices to the same mount point: if
btrfs_exclop_balance finishes before the latter thread executes the
assertion in btrfs_exclop_balance, exclusive_operation will have changed
to the BTRFS_EXCLOP_BALANCE_PAUSED state, which leads to the assertion failure:

  fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE ||
  fs_info->exclusive_operation == BTRFS_EXCLOP_DEV_ADD ||
  fs_info->exclusive_operation == BTRFS_EXCLOP_NONE,
  fs/btrfs/ioctl.c:458
  Call Trace:
   <TASK>
   btrfs_exclop_balance+0x240/0x410
   ? memdup_user+0xab/0xc0
   ? PTR_ERR+0x17/0x20
   btrfs_ioctl_add_dev+0x2ee/0x320
   btrfs_ioctl+0x9d5/0x10d0
   ? btrfs_ioctl_encoded_write+0xb80/0xb80
   __x64_sys_ioctl+0x197/0x210
   do_syscall_64+0x3c/0xb0
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

An example of the failed assertion is below, which shows that the
paused balance state also needs to be checked.

  root@syzkaller:/home/xsk# ./repro
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  [  416.611428][ T7970] BTRFS info (device loop0): fs_info exclusive_operation: 0
  Failed to add device /dev/vda, errno 14
  [  416.613973][ T7971] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.615456][ T7972] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.617528][ T7973] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.618359][ T7974] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.622589][ T7975] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.624034][ T7976] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.626420][ T7977] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.627643][ T7978] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.629006][ T7979] BTRFS info (device loop0): fs_info exclusive_operation: 3
  [  416.630298][ T7980] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  Failed to add device /dev/vda, errno 14
  [  416.632787][ T7981] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.634282][ T7982] BTRFS info (device loop0): fs_info exclusive_operation: 3
  Failed to add device /dev/vda, errno 14
  [  416.636202][ T7983] BTRFS info (device loop0): fs_info exclusive_operation: 3
  [  416.637012][ T7984] BTRFS info (device loop0): fs_info exclusive_operation: 1
  Failed to add device /dev/vda, errno 14
  [  416.637759][ T7984] assertion failed: fs_info->exclusive_operation ==
  BTRFS_EXCLOP_BALANCE || fs_info->exclusive_operation ==
  BTRFS_EXCLOP_DEV_ADD || fs_info->exclusive_operation ==
  BTRFS_EXCLOP_NONE, in fs/btrfs/ioctl.c:458
  [  416.639845][ T7984] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
  [  416.640485][ T7984] CPU: 0 PID: 7984 Comm: repro Not tainted 6.2.0 #7
  [  416.641172][ T7984] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
  [  416.642090][ T7984] RIP: 0010:btrfs_assertfail+0x2c/0x2e
  [  416.644423][ T7984] RSP: 0018:ffffc90003ea7e28 EFLAGS: 00010282
  [  416.645018][ T7984] RAX: 00000000000000cc RBX: 0000000000000000 RCX: 0000000000000000
  [  416.645763][ T7984] RDX: ffff88801d030000 RSI: ffffffff81637e7c RDI: fffff520007d4fb7
  [  416.646554][ T7984] RBP: ffffffff8a533de0 R08: 00000000000000cc R09: 0000000000000000
  [  416.647299][ T7984] R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff8a533da0
  [  416.648041][ T7984] R13: 00000000000001ca R14: 000000005000940a R15: 0000000000000000
  [  416.648785][ T7984] FS:  00007fa2985d4640(0000) GS:ffff88802cc00000(0000) knlGS:0000000000000000
  [  416.649616][ T7984] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  416.650238][ T7984] CR2: 0000000000000000 CR3: 0000000018e5e000 CR4: 0000000000750ef0
  [  416.650980][ T7984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [  416.651725][ T7984] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [  416.652502][ T7984] PKRU: 55555554
  [  416.652888][ T7984] Call Trace:
  [  416.653241][ T7984]  <TASK>
  [  416.653527][ T7984]  btrfs_exclop_balance+0x240/0x410
  [  416.654036][ T7984]  ? memdup_user+0xab/0xc0
  [  416.654465][ T7984]  ? PTR_ERR+0x17/0x20
  [  416.654874][ T7984]  btrfs_ioctl_add_dev+0x2ee/0x320
  [  416.655380][ T7984]  btrfs_ioctl+0x9d5/0x10d0
  [  416.655822][ T7984]  ? btrfs_ioctl_encoded_write+0xb80/0xb80
  [  416.656400][ T7984]  __x64_sys_ioctl+0x197/0x210
  [  416.656874][ T7984]  do_syscall_64+0x3c/0xb0
  [  416.657346][ T7984]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
  [  416.657922][ T7984] RIP: 0033:0x4546af
  [  416.660170][ T7984] RSP: 002b:00007fa2985d4150 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  [  416.660972][ T7984] RAX: ffffffffffffffda RBX: 00007fa2985d4640 RCX: 00000000004546af
  [  416.661714][ T7984] RDX: 0000000000000000 RSI: 000000005000940a RDI: 0000000000000003
  [  416.662449][ T7984] RBP: 00007fa2985d41d0 R08: 0000000000000000 R09: 00007ffee37a4c4f
  [  416.663195][ T7984] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fa2985d4640
  [  416.663951][ T7984] R13: 0000000000000009 R14: 000000000041b320 R15: 00007fa297dd4000
  [  416.664703][ T7984]  </TASK>
  [  416.665040][ T7984] Modules linked in:
  [  416.665590][ T7984] ---[ end trace 0000000000000000 ]---
  [  416.666176][ T7984] RIP: 0010:btrfs_assertfail+0x2c/0x2e
  [  416.668775][ T7984] RSP: 0018:ffffc90003ea7e28 EFLAGS: 00010282
  [  416.669425][ T7984] RAX: 00000000000000cc RBX: 0000000000000000 RCX: 0000000000000000
  [  416.670235][ T7984] RDX: ffff88801d030000 RSI: ffffffff81637e7c RDI: fffff520007d4fb7
  [  416.671050][ T7984] RBP: ffffffff8a533de0 R08: 00000000000000cc R09: 0000000000000000
  [  416.671867][ T7984] R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff8a533da0
  [  416.672685][ T7984] R13: 00000000000001ca R14: 000000005000940a R15: 0000000000000000
  [  416.673501][ T7984] FS:  00007fa2985d4640(0000) GS:ffff88802cc00000(0000) knlGS:0000000000000000
  [  416.674425][ T7984] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  416.675114][ T7984] CR2: 0000000000000000 CR3: 0000000018e5e000 CR4: 0000000000750ef0
  [  416.675933][ T7984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [  416.676760][ T7984] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
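
The two scenarios can be summarised with a small model of the assertion. This is a sketch, not the kernel code: the numeric values 0, 3 and 1 are inferred from the "exclusive_operation" lines in the log above, the value 2 for BTRFS_EXCLOP_BALANCE is a placeholder assumption, and `FIXED_ALLOWED` is simply the set of states that the commit message says must be taken into account.

```python
from enum import IntEnum


class Exclop(IntEnum):
    # Values are assumptions inferred from the log, not copied from the kernel.
    NONE = 0
    BALANCE_PAUSED = 1
    BALANCE = 2
    DEV_ADD = 3


# States accepted by the assertion quoted in scenario 1 (ioctl.c:456).
SCENARIO1_ALLOWED = {Exclop.BALANCE, Exclop.DEV_ADD}
# States accepted by the assertion quoted in scenario 2 (ioctl.c:458).
SCENARIO2_ALLOWED = SCENARIO1_ALLOWED | {Exclop.NONE}
# With the paused balance state also taken into account.
FIXED_ALLOWED = SCENARIO2_ALLOWED | {Exclop.BALANCE_PAUSED}


def assertion_holds(state, allowed):
    """Return True if `state` would pass an assertion over `allowed`."""
    return Exclop(state) in allowed


if __name__ == "__main__":
    for value in (0, 3, 3, 1):  # sequence of values seen in the log
        print(
            Exclop(value).name,
            assertion_holds(value, SCENARIO2_ALLOWED),
            assertion_holds(value, FIXED_ALLOWED),
        )
```

In this model the last logged value (1, paused balance) is the only one rejected by the scenario 2 check, which matches the point at which the log reports the failed assertion.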

Link: https://lore.kernel.org/linux-btrfs/[email protected]/
CC: [email protected] # 6.1+
Signed-off-by: xiaoshoukui <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
z3ntu pushed a commit that referenced this pull request Jun 15, 2023
In the function ieee80211_tx_dequeue() there is a particular locking
sequence:

begin:
	spin_lock(&local->queue_stop_reason_lock);
	q_stopped = local->queue_stop_reasons[q];
	spin_unlock(&local->queue_stop_reason_lock);

However small the chance (increased by ftracetest), an asynchronous
interrupt can occur between spin_lock() and spin_unlock(),
and the interrupt routine will attempt to lock the same
&local->queue_stop_reason_lock again.

This will cause a costly reset of the CPU and the wifi device or a
complete hang in the single CPU and single core scenario.

The only remaining spin_lock(&local->queue_stop_reason_lock) that
did not disable interrupts was patched, which should prevent any
deadlocks on the same CPU/core and the same wifi device.

This is the probable trace of the deadlock:

kernel: ================================
kernel: WARNING: inconsistent lock state
kernel: 6.3.0-rc6-mt-20230401-00001-gf86822a1170f #4 Tainted: G        W
kernel: --------------------------------
kernel: inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
kernel: kworker/5:0/25656 [HC0[0]:SC0[0]:HE1:SE1] takes:
kernel: ffff9d6190779478 (&local->queue_stop_reason_lock){+.?.}-{2:2}, at: return_to_handler+0x0/0x40
kernel: {IN-SOFTIRQ-W} state was registered at:
kernel:   lock_acquire+0xc7/0x2d0
kernel:   _raw_spin_lock+0x36/0x50
kernel:   ieee80211_tx_dequeue+0xb4/0x1330 [mac80211]
kernel:   iwl_mvm_mac_itxq_xmit+0xae/0x210 [iwlmvm]
kernel:   iwl_mvm_mac_wake_tx_queue+0x2d/0xd0 [iwlmvm]
kernel:   ieee80211_queue_skb+0x450/0x730 [mac80211]
kernel:   __ieee80211_xmit_fast.constprop.66+0x834/0xa50 [mac80211]
kernel:   __ieee80211_subif_start_xmit+0x217/0x530 [mac80211]
kernel:   ieee80211_subif_start_xmit+0x60/0x580 [mac80211]
kernel:   dev_hard_start_xmit+0xb5/0x260
kernel:   __dev_queue_xmit+0xdbe/0x1200
kernel:   neigh_resolve_output+0x166/0x260
kernel:   ip_finish_output2+0x216/0xb80
kernel:   __ip_finish_output+0x2a4/0x4d0
kernel:   ip_finish_output+0x2d/0xd0
kernel:   ip_output+0x82/0x2b0
kernel:   ip_local_out+0xec/0x110
kernel:   igmpv3_sendpack+0x5c/0x90
kernel:   igmp_ifc_timer_expire+0x26e/0x4e0
kernel:   call_timer_fn+0xa5/0x230
kernel:   run_timer_softirq+0x27f/0x550
kernel:   __do_softirq+0xb4/0x3a4
kernel:   irq_exit_rcu+0x9b/0xc0
kernel:   sysvec_apic_timer_interrupt+0x80/0xa0
kernel:   asm_sysvec_apic_timer_interrupt+0x1f/0x30
kernel:   _raw_spin_unlock_irqrestore+0x3f/0x70
kernel:   free_to_partial_list+0x3d6/0x590
kernel:   __slab_free+0x1b7/0x310
kernel:   kmem_cache_free+0x52d/0x550
kernel:   putname+0x5d/0x70
kernel:   do_sys_openat2+0x1d7/0x310
kernel:   do_sys_open+0x51/0x80
kernel:   __x64_sys_openat+0x24/0x30
kernel:   do_syscall_64+0x5c/0x90
kernel:   entry_SYSCALL_64_after_hwframe+0x72/0xdc
kernel: irq event stamp: 5120729
kernel: hardirqs last  enabled at (5120729): [<ffffffff9d149936>] trace_graph_return+0xd6/0x120
kernel: hardirqs last disabled at (5120728): [<ffffffff9d149950>] trace_graph_return+0xf0/0x120
kernel: softirqs last  enabled at (5069900): [<ffffffff9cf65b60>] return_to_handler+0x0/0x40
kernel: softirqs last disabled at (5067555): [<ffffffff9cf65b60>] return_to_handler+0x0/0x40
kernel:
        other info that might help us debug this:
kernel:  Possible unsafe locking scenario:
kernel:        CPU0
kernel:        ----
kernel:   lock(&local->queue_stop_reason_lock);
kernel:   <Interrupt>
kernel:     lock(&local->queue_stop_reason_lock);
kernel:
         *** DEADLOCK ***
kernel: 8 locks held by kworker/5:0/25656:
kernel:  #0: ffff9d618009d138 ((wq_completion)events_freezable){+.+.}-{0:0}, at: process_one_work+0x1ca/0x530
kernel:  #1: ffffb1ef4637fe68 ((work_completion)(&local->restart_work)){+.+.}-{0:0}, at: process_one_work+0x1ce/0x530
kernel:  #2: ffffffff9f166548 (rtnl_mutex){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
kernel:  #3: ffff9d6190778728 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
kernel:  #4: ffff9d619077b480 (&mvm->mutex){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
kernel:  #5: ffff9d61907bacd8 (&trans_pcie->mutex){+.+.}-{3:3}, at: return_to_handler+0x0/0x40
kernel:  #6: ffffffff9ef9cda0 (rcu_read_lock){....}-{1:2}, at: iwl_mvm_queue_state_change+0x59/0x3a0 [iwlmvm]
kernel:  #7: ffffffff9ef9cda0 (rcu_read_lock){....}-{1:2}, at: iwl_mvm_mac_itxq_xmit+0x42/0x210 [iwlmvm]
kernel:
        stack backtrace:
kernel: CPU: 5 PID: 25656 Comm: kworker/5:0 Tainted: G        W          6.3.0-rc6-mt-20230401-00001-gf86822a1170f #4
kernel: Hardware name: LENOVO 82H8/LNVNB161216, BIOS GGCN51WW 11/16/2022
kernel: Workqueue: events_freezable ieee80211_restart_work [mac80211]
kernel: Call Trace:
kernel:  <TASK>
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  dump_stack_lvl+0x5f/0xa0
kernel:  dump_stack+0x14/0x20
kernel:  print_usage_bug.part.46+0x208/0x2a0
kernel:  mark_lock.part.47+0x605/0x630
kernel:  ? sched_clock+0xd/0x20
kernel:  ? trace_clock_local+0x14/0x30
kernel:  ? __rb_reserve_next+0x5f/0x490
kernel:  ? _raw_spin_lock+0x1b/0x50
kernel:  __lock_acquire+0x464/0x1990
kernel:  ? mark_held_locks+0x4e/0x80
kernel:  lock_acquire+0xc7/0x2d0
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  ? ftrace_return_to_handler+0x8b/0x100
kernel:  ? preempt_count_add+0x4/0x70
kernel:  _raw_spin_lock+0x36/0x50
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  ieee80211_tx_dequeue+0xb4/0x1330 [mac80211]
kernel:  ? prepare_ftrace_return+0xc5/0x190
kernel:  ? ftrace_graph_func+0x16/0x20
kernel:  ? 0xffffffffc02ab0b1
kernel:  ? lock_acquire+0xc7/0x2d0
kernel:  ? iwl_mvm_mac_itxq_xmit+0x42/0x210 [iwlmvm]
kernel:  ? ieee80211_tx_dequeue+0x9/0x1330 [mac80211]
kernel:  ? __rcu_read_lock+0x4/0x40
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  iwl_mvm_mac_itxq_xmit+0xae/0x210 [iwlmvm]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  iwl_mvm_queue_state_change+0x311/0x3a0 [iwlmvm]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  iwl_mvm_wake_sw_queue+0x17/0x20 [iwlmvm]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  iwl_txq_gen2_unmap+0x1c9/0x1f0 [iwlwifi]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  iwl_txq_gen2_free+0x55/0x130 [iwlwifi]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  iwl_txq_gen2_tx_free+0x63/0x80 [iwlwifi]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  _iwl_trans_pcie_gen2_stop_device+0x3f3/0x5b0 [iwlwifi]
kernel:  ? _iwl_trans_pcie_gen2_stop_device+0x9/0x5b0 [iwlwifi]
kernel:  ? mutex_lock_nested+0x4/0x30
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  iwl_trans_pcie_gen2_stop_device+0x5f/0x90 [iwlwifi]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  iwl_mvm_stop_device+0x78/0xd0 [iwlmvm]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  __iwl_mvm_mac_start+0x114/0x210 [iwlmvm]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  iwl_mvm_mac_start+0x76/0x150 [iwlmvm]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  drv_start+0x79/0x180 [mac80211]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  ieee80211_reconfig+0x1523/0x1ce0 [mac80211]
kernel:  ? synchronize_net+0x4/0x50
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  ieee80211_restart_work+0x108/0x170 [mac80211]
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  process_one_work+0x250/0x530
kernel:  ? ftrace_regs_caller_end+0x66/0x66
kernel:  worker_thread+0x48/0x3a0
kernel:  ? __pfx_worker_thread+0x10/0x10
kernel:  kthread+0x10f/0x140
kernel:  ? __pfx_kthread+0x10/0x10
kernel:  ret_from_fork+0x29/0x50
kernel:  </TASK>

Fixes: 4444bc2 ("wifi: mac80211: Proper mark iTXQs for resumption")
Link: https://lore.kernel.org/all/[email protected]/
Reported-by: Mirsad Goran Todorovac <[email protected]>
Cc: Gregory Greenman <[email protected]>
Cc: Johannes Berg <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Cc: David S. Miller <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: Alexander Wetzel <[email protected]>
Signed-off-by: Mirsad Goran Todorovac <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Reviewed-by: tag, or it goes automatically?
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
z3ntu pushed a commit that referenced this pull request Jun 15, 2023
The cited commit adds a completion to remove the dependency on the rtnl
lock. But it causes a deadlock for multiple encapsulations:

 crash> bt ffff8aece8a64000
 PID: 1514557  TASK: ffff8aece8a64000  CPU: 3    COMMAND: "tc"
  #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418
  #2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898
  #3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8
  #4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb
  #5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core]
  #6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core]
  #7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core]
  #8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core]
  #9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core]
 #10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core]
 #11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core]
 #12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8
 #13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower]
 #14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower]
 #15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047
 #16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31
 #17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853
 #18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835
 #19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27
 #20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245
 #21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482
 #22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a
 #23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2
 #24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2
 #25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f
 #26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8
 #27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c
 crash> bt 0xffff8aeb07544000
 PID: 1110766  TASK: ffff8aeb07544000  CPU: 0    COMMAND: "kworker/u20:9"
  #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418
  #2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88
  #3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b
  #4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core]
  #5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core]
  #6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core]
  #7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c
  #8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012
  #9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d
 #10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f

After the first encap is attached, the flow is added to the encap
entry's flows list. If a neigh update is running at this time, the
following encaps of the flow can't take the encap_tbl_lock and
sleep. If the neigh update thread is waiting for that flow's init_done,
a deadlock happens.

Fix it by holding lock outside of the for loop. If neigh update is
running, prevent encap flows from offloading. Since the lock is held
outside of the for loop, concurrent creation of encap entries is not
allowed. So remove unnecessary wait_for_completion call for res_ready.

Fixes: 95435ad ("net/mlx5e: Only access fully initialized flows in neigh update")
Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
z3ntu pushed a commit that referenced this pull request Jun 15, 2023
A remote DoS vulnerability of RPL Source Routing is assigned CVE-2023-2156.

The Source Routing Header (SRH) has the following format:

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |  Next Header  |  Hdr Ext Len  | Routing Type  | Segments Left |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  | CmprI | CmprE |  Pad  |               Reserved                |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                                                               |
  .                                                               .
  .                        Addresses[1..n]                        .
  .                                                               .
  |                                                               |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The originator of an SRH places the first hop's IPv6 address in the IPv6
header's IPv6 Destination Address and the second hop's IPv6 address as
the first address in Addresses[1..n].

The CmprI and CmprE fields indicate the number of prefix octets that are
shared with the IPv6 Destination Address.  When CmprI or CmprE is not 0,
Addresses[1..n] are compressed as follows:

  1..n-1 : (16 - CmprI) bytes
       n : (16 - CmprE) bytes

Segments Left indicates the number of route segments remaining.  When the
value is not zero, the SRH is forwarded to the next hop.  Its address
is extracted from Addresses[n - Segment Left + 1] and swapped with IPv6
Destination Address.

When Segment Left is greater than or equal to 2, the size of SRH is not
changed because Addresses[1..n-1] are decompressed and recompressed with
CmprI.

OTOH, when Segment Left changes from 1 to 0, the new SRH could have a
different size because Addresses[1..n-1] are decompressed with CmprI and
recompressed with CmprE.

Let's say CmprI is 15 and CmprE is 0.  When we receive an SRH with Segments
Left >= 2, Addresses[1..n-1] have 1 byte each, and Addresses[n] has 16
bytes.  When Segments Left is 1, Addresses[1..n-1] are decompressed to 16
bytes each and not recompressed.  As a result, the new SRH needs more room
in the header: an extra (16 - 1) * (n - 1) bytes.
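
The growth in this example can be checked with a small calculation. This sketch follows the simplified description in this message (Addresses[1..n-1] go from CmprI to CmprE compression when Segments Left drops from 1 to 0); the helper srh_growth is hypothetical and does not reproduce the kernel's actual recompression logic.

```python
def srh_growth(n: int, cmpri: int, cmpre: int) -> int:
    """Extra bytes needed when Addresses[1..n-1] switch from CmprI to CmprE."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Size before: n-1 entries compressed with CmprI, last one with CmprE.
    before = (n - 1) * (16 - cmpri) + (16 - cmpre)
    # Size after: every entry is compressed with CmprE.
    after = n * (16 - cmpre)
    return after - before


if __name__ == "__main__":
    # CmprI = 15, CmprE = 0 and four addresses: (16 - 1) * (4 - 1) bytes.
    print(srh_growth(4, cmpri=15, cmpre=0))
```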

Here the max value of n is 255 as Segments Left is u8, so in the worst case,
we have to allocate 3825 bytes in the skb headroom.  However, we currently
allocate only a small fixed buffer of IPV6_RPL_SRH_WORST_SWAP_SIZE (16 + 7
bytes).  If the decompressed size overflows that room, skb_push() hits the
BUG() below [0].

Instead of allocating the fixed buffer for every packet, let's allocate
enough headroom only when we receive an SRH with Segments Left 1.

[0]:
skbuff: skb_under_panic: text:ffffffff81c9f6e2 len:576 put:576 head:ffff8880070b5180 data:ffff8880070b4fb0 tail:0x70 end:0x140 dev:lo
kernel BUG at net/core/skbuff.c:200!
invalid opcode: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 154 Comm: python3 Not tainted 6.4.0-rc4-00190-gc308e9ec0047 #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:skb_panic (net/core/skbuff.c:200)
Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50 ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 80 6e 77 82 e8 ad 8b 60 ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90
RSP: 0018:ffffc90000003da0 EFLAGS: 00000246
RAX: 0000000000000085 RBX: ffff8880058a6600 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88807dc1c540 RDI: ffff88807dc1c540
RBP: ffffc90000003e48 R08: ffffffff82b392c8 R09: 00000000ffffdfff
R10: ffffffff82a592e0 R11: ffffffff82b092e0 R12: ffff888005b1c800
R13: ffff8880070b51b8 R14: ffff888005b1ca18 R15: ffff8880070b5190
FS:  00007f4539f0b740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055670baf3000 CR3: 0000000005b0e000 CR4: 00000000007506f0
PKRU: 55555554
Call Trace:
 <IRQ>
 skb_push (net/core/skbuff.c:210)
 ipv6_rthdr_rcv (./include/linux/skbuff.h:2880 net/ipv6/exthdrs.c:634 net/ipv6/exthdrs.c:718)
 ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437 (discriminator 5))
 ip6_input_finish (./include/linux/rcupdate.h:805 net/ipv6/ip6_input.c:483)
 __netif_receive_skb_one_core (net/core/dev.c:5494)
 process_backlog (./include/linux/rcupdate.h:805 net/core/dev.c:5934)
 __napi_poll (net/core/dev.c:6496)
 net_rx_action (net/core/dev.c:6565 net/core/dev.c:6696)
 __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:572)
 do_softirq (kernel/softirq.c:472 kernel/softirq.c:459)
 </IRQ>
 <TASK>
 __local_bh_enable_ip (kernel/softirq.c:396)
 __dev_queue_xmit (net/core/dev.c:4272)
 ip6_finish_output2 (./include/net/neighbour.h:544 net/ipv6/ip6_output.c:134)
 rawv6_sendmsg (./include/net/dst.h:458 ./include/linux/netfilter.h:303 net/ipv6/raw.c:656 net/ipv6/raw.c:914)
 sock_sendmsg (net/socket.c:724 net/socket.c:747)
 __sys_sendto (net/socket.c:2144)
 __x64_sys_sendto (net/socket.c:2156 net/socket.c:2152 net/socket.c:2152)
 do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
RIP: 0033:0x7f453a138aea
Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
RSP: 002b:00007ffcc212a1c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007ffcc212a288 RCX: 00007f453a138aea
RDX: 0000000000000060 RSI: 00007f4539084c20 RDI: 0000000000000003
RBP: 00007f4538308e80 R08: 00007ffcc212a300 R09: 000000000000001c
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007f4539712d1b
 </TASK>
Modules linked in:

Fixes: 8610c7c ("net: ipv6: add support for rpl sr exthdr")
Reported-by: Max VA
Closes: https://www.interruptlabs.co.uk/articles/linux-ipv6-route-of-death
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
z3ntu pushed a commit that referenced this pull request Jun 15, 2023
Currently, the per-CPU upcall counters are allocated after the vport is
created and inserted into the system. This could lead to the datapath
accessing the counters before they are allocated, resulting in a kernel
Oops.

Here is an example:

  PID: 59693    TASK: ffff0005f4f51500  CPU: 0    COMMAND: "ovs-vswitchd"
   #0 [ffff80000a39b5b0] __switch_to at ffffb70f0629f2f4
   #1 [ffff80000a39b5d0] __schedule at ffffb70f0629f5cc
   #2 [ffff80000a39b650] preempt_schedule_common at ffffb70f0629fa60
   #3 [ffff80000a39b670] dynamic_might_resched at ffffb70f0629fb58
   #4 [ffff80000a39b680] mutex_lock_killable at ffffb70f062a1388
   #5 [ffff80000a39b6a0] pcpu_alloc at ffffb70f0594460c
   #6 [ffff80000a39b750] __alloc_percpu_gfp at ffffb70f05944e68
   #7 [ffff80000a39b760] ovs_vport_cmd_new at ffffb70ee6961b90 [openvswitch]
   ...

  PID: 58682    TASK: ffff0005b2f0bf00  CPU: 0    COMMAND: "kworker/0:3"
   #0 [ffff80000a5d2f40] machine_kexec at ffffb70f056a0758
   #1 [ffff80000a5d2f70] __crash_kexec at ffffb70f057e2994
   #2 [ffff80000a5d3100] crash_kexec at ffffb70f057e2ad8
   #3 [ffff80000a5d3120] die at ffffb70f0628234c
   #4 [ffff80000a5d31e0] die_kernel_fault at ffffb70f062828a8
   #5 [ffff80000a5d3210] __do_kernel_fault at ffffb70f056a31f4
   #6 [ffff80000a5d3240] do_bad_area at ffffb70f056a32a4
   #7 [ffff80000a5d3260] do_translation_fault at ffffb70f062a9710
   #8 [ffff80000a5d3270] do_mem_abort at ffffb70f056a2f74
   #9 [ffff80000a5d32a0] el1_abort at ffffb70f06297dac
  #10 [ffff80000a5d32d0] el1h_64_sync_handler at ffffb70f06299b24
  #11 [ffff80000a5d3410] el1h_64_sync at ffffb70f056812dc
  #12 [ffff80000a5d3430] ovs_dp_upcall at ffffb70ee6963c84 [openvswitch]
  #13 [ffff80000a5d3470] ovs_dp_process_packet at ffffb70ee6963fdc [openvswitch]
  #14 [ffff80000a5d34f0] ovs_vport_receive at ffffb70ee6972c78 [openvswitch]
  #15 [ffff80000a5d36f0] netdev_port_receive at ffffb70ee6973948 [openvswitch]
  #16 [ffff80000a5d3720] netdev_frame_hook at ffffb70ee6973a28 [openvswitch]
  #17 [ffff80000a5d3730] __netif_receive_skb_core.constprop.0 at ffffb70f06079f90

We moved the per cpu upcall counter allocation to the existing vport
alloc and free functions to solve this.
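
The ordering idea can be sketched outside the kernel as "finish all allocation before publishing the object". Everything below is a hypothetical model (Vport, vport_alloc, vport_cmd_new, datapath_ports and upcall are illustrative names, not the openvswitch API); it only shows why allocating the counters inside the alloc step removes the window in which a published port has no counters.

```python
class Vport:
    """Toy stand-in for a vport with per-CPU upcall counters."""

    def __init__(self, name):
        self.name = name
        self.upcall_stats = None  # not yet allocated


def vport_alloc(name, n_cpus):
    """Allocate the vport and its counters in one step."""
    vport = Vport(name)
    vport.upcall_stats = [0] * n_cpus
    return vport


def vport_free(vport):
    """Release the counters together with the vport."""
    vport.upcall_stats = None


datapath_ports = {}  # ports visible to the packet path


def vport_cmd_new(name, n_cpus):
    """Publish the port only after it is fully initialised."""
    vport = vport_alloc(name, n_cpus)
    datapath_ports[name] = vport
    return vport


def upcall(name, cpu):
    """Packet path: safe because published ports always have counters."""
    datapath_ports[name].upcall_stats[cpu] += 1


if __name__ == "__main__":
    vport_cmd_new("p0", n_cpus=2)
    upcall("p0", cpu=1)
    print(datapath_ports["p0"].upcall_stats)
```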

Fixes: 95637d9 ("net: openvswitch: release vport resources on failure")
Fixes: 1933ea3 ("net: openvswitch: Add support to count upcall packets")
Signed-off-by: Eelco Chaudron <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Acked-by: Aaron Conole <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
z3ntu pushed a commit that referenced this pull request Apr 5, 2024
commit 4be9075 upstream.

The driver creates /sys/kernel/debug/dri/0/mob_ttm even when the
corresponding ttm_resource_manager is not allocated.
This leads to a crash when trying to read from this file.

Add a check to create the mob_ttm, system_mob_ttm, and gmr_ttm debug files
only when the corresponding ttm_resource_manager is allocated.
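
The guard can be pictured as a filter over the available managers. This is only a toy sketch under stated assumptions: the function debug_files_to_create and the dictionary of managers are hypothetical, with None standing in for a manager that was never allocated.

```python
def debug_files_to_create(managers):
    """Return debug file names only for managers that are allocated.

    `managers` maps a short name (e.g. "mob") to a manager object, or to
    None when that manager was not allocated.
    """
    return [f"{name}_ttm" for name, man in managers.items() if man is not None]


if __name__ == "__main__":
    managers = {"mob": None, "system_mob": object(), "gmr": None}
    print(debug_files_to_create(managers))
```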

crash> bt
PID: 3133409  TASK: ffff8fe4834a5000  CPU: 3    COMMAND: "grep"
 #0 [ffffb954506b3b20] machine_kexec at ffffffffb2a6bec3
 #1 [ffffb954506b3b78] __crash_kexec at ffffffffb2bb598a
 #2 [ffffb954506b3c38] crash_kexec at ffffffffb2bb68c1
 #3 [ffffb954506b3c50] oops_end at ffffffffb2a2a9b1
 #4 [ffffb954506b3c70] no_context at ffffffffb2a7e913
 #5 [ffffb954506b3cc8] __bad_area_nosemaphore at ffffffffb2a7ec8c
 #6 [ffffb954506b3d10] do_page_fault at ffffffffb2a7f887
 #7 [ffffb954506b3d40] page_fault at ffffffffb360116e
    [exception RIP: ttm_resource_manager_debug+0x11]
    RIP: ffffffffc04afd11  RSP: ffffb954506b3df0  RFLAGS: 00010246
    RAX: ffff8fe41a6d1200  RBX: 0000000000000000  RCX: 0000000000000940
    RDX: 0000000000000000  RSI: ffffffffc04b4338  RDI: 0000000000000000
    RBP: ffffb954506b3e08   R8: ffff8fee3ffad000   R9: 0000000000000000
    R10: ffff8fe41a76a000  R11: 0000000000000001  R12: 00000000ffffffff
    R13: 0000000000000001  R14: ffff8fe5bb6f3900  R15: ffff8fe41a6d1200
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffb954506b3e00] ttm_resource_manager_show at ffffffffc04afde7 [ttm]
 #9 [ffffb954506b3e30] seq_read at ffffffffb2d8f9f3
    RIP: 00007f4c4eda8985  RSP: 00007ffdbba9e9f8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 000000000037e000  RCX: 00007f4c4eda8985
    RDX: 000000000037e000  RSI: 00007f4c41573000  RDI: 0000000000000003
    RBP: 000000000037e000   R8: 0000000000000000   R9: 000000000037fe30
    R10: 0000000000000000  R11: 0000000000000246  R12: 00007f4c41573000
    R13: 0000000000000003  R14: 00007f4c41572010  R15: 0000000000000003
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b

Signed-off-by: Jocelyn Falempe <[email protected]>
Fixes: af4a25b ("drm/vmwgfx: Add debugfs entries for various ttm resource managers")
Cc: <[email protected]>
Reviewed-by: Zack Rusin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>