Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace mtime counter with realtime timer #49

Merged
merged 1 commit into from
Aug 1, 2024

Conversation

chiangkd
Copy link
Collaborator

Replace mtime counter by retrieving time of the specified clock ID. Also, Real time counter (RTC) mechanism is implemented to prevent mtime overflow.

Before:

[    0.000000] Linux version 6.1.94 (aaron@aaron-Lenovo-Y520-15IKBN) (riscv32-buildroot-linux-gnu-gcc.br_real (Buildroot 2024.02.3) 12.3.0, GNU ld (GNU Binutils) 2.41) #8 SMP Thu Jul 11 00:37:05 CST 2024
[    0.000000] Machine model: semu
[    0.000000] earlycon: ns16550 at MMIO 0xf4000000 (options '')
[    0.000000] printk: bootconsole [ns16550] enabled
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x000000001fffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
...
[    2.687661] Freeing initrd memory: 8188K
[    2.691697] Freeing unused kernel image (initmem) memory: 180K
[    2.691978] Kernel memory protection not selected by kernel config.
[    2.692277] Run /init as init process

Now:

[    0.000000] Linux version 6.1.94 (aaron@aaron-Lenovo-Y520-15IKBN) (riscv32-buildroot-linux-gnu-gcc.br_real (Buildroot 2024.02.3) 12.3.0, GNU ld (GNU Binutils) 2.41) #10 SMP Mon Jul 15 02:47:57 CST 2024
[    0.000000] Machine model: semu
[    0.000000] earlycon: ns16550 at MMIO 0xf4000000 (options '')
[    0.000000] printk: bootconsole [ns16550] enabled
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x000000001fffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
...
[   73.908933] Freeing initrd memory: 8188K
[   74.033405] Freeing unused kernel image (initmem) memory: 180K
[   74.044397] Kernel memory protection not selected by kernel config.
[   74.064088] Run /init as init process

Revised boot log to display actual timer during boot flow.

Reference to RVVM:https://github.com/LekKit/RVVM/blob/staging/src/rvtimer.c

However, another issue accopanies this PR

[   32.038173] rcu: INFO: rcu_sched self-detected stall on CPU
[   32.043867] rcu: 	3-....: (5248 ticks this GP) idle=846c/1/0x40000002 softirq=371/371 fqs=2114
[   32.055848] 	(t=5251 jiffies g=-883 q=10 ncpus=4)
[   32.064761] CPU: 3 PID: 39 Comm: kworker/u8:1 Not tainted 6.1.94 #10

RCU issues a CPU stall wrning when it waits more than a grace period, which is normally 21 seconds.

Detail here: https://docs.kernel.org/RCU/stallwarn.html

@jserv jserv requested a review from ranvd July 15, 2024 01:02
timer.h Outdated Show resolved Hide resolved
clint.c Outdated
Comment on lines 8 to 10
uint64_t time_delta =
clint->mtimecmp[hart->mhartid] - semu_timer_gettime(&hart->time);
if (time_delta & 0x8000000000000000 || time_delta == 0)
Copy link
Collaborator

@ranvd ranvd Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better to cast time_delta to a int64_t instead of unsigned integer? By doing this, the comparison in the if statement can be reduced and more intuitive.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, casting time_delta can be more intutive. Below is the refined implementation

uint64_t time_delta =
    clint->mtimecmp[hart->mhartid] - semu_timer_gettime(&hart->time);
if ((int32_t)time_delta <= 0)
...

@ranvd
Copy link
Collaborator

ranvd commented Jul 15, 2024

I believe using a real-time counter is a better approach for simulating the timer device. However, I didn't notice any differences between the "Before" and "Now" sections in your text.

Also, the RCU issue mentioned above is caused by too frequent timer interrupt triggers. Since RVVM is a multi-threaded RISC-V simulator, it's okay to directly simulate the timer as you implemented. However, semu is currently a single-threaded simulator, so SEMU becomes slower as the number of simulated harts grows. Therefore, the frequency at semu_timer_init should be at least ceil(log2(n)) times slower. (n is the number of simulated harts)

@ranvd
Copy link
Collaborator

ranvd commented Jul 15, 2024

I believe using a real-time counter is a better approach for simulating the timer device. However, I didn't notice any differences between the "Before" and "Now" sections in your text.

Sorry I need to take back what I said earlier. The real-time counter may not always be a better implementation. After measuring the boot-up time with SMP=4, the original version takes around 23 seconds, while this commit takes 39 seconds. I think this is because the real-time counter generates more interrupts then the original version.

Does the real-time counter (RTC) provide any features that can't be neglected?

@chiangkd
Copy link
Collaborator Author

Does the real-time counter (RTC) provide any features that can't be neglected?

Initially, my intention was to enable the actual clock for the RDTIME (0xC01/0xC81) pseudo-instruction, which originally retrieves the monotonic-incremented mtime. I did not anticipate that this PR would trigger RCU warnings prior to implementation. Post-implementation, real timestamps are visible in the booting log due to CONFIG_PRINTK_TIME being enabled by default.

SEMU becomes slower as the number of simulated harts grows.

I acknowledge the observation that SEMU slows down as the number of simulated harts increases. While reducing the frequency in semu_timer_init might mitigate this, it would compromise the provision of an actual clock through the TIMER CSR. In my undetstanding, simulating a multi-core system with a single thread inherently reduces efficiency(include much more timer interrupt and much more clock_gettime system call in this PR), leading to RCU warnings.

Since RVVM is a multi-threaded RISC-V simulator, it's okay to directly simulate the timer as you implemented.

I believe implementing multi-threaded simulation for SEMU could provide a better solution to the RCU warning issue, is that make sense?

main.c Outdated Show resolved Hide resolved
Copy link
Collaborator

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Squash the git commits and rebase the latest master branch.

@jserv
Copy link
Collaborator

jserv commented Jul 21, 2024

Properly rebase the latest master branch via the following commands:

$ git remote add upstream https://github.com/sysprog21/semu
$ git fetch upstream 
$ git rebase upstream/master
$ git push --force origin master

Next time, create a dedicated branch rather than master branch when you are going to submit pull requests.

@chiangkd
Copy link
Collaborator Author

Next time, create a dedicated branch rather than master branch when you are going to submit pull requests.

Thank you for the instructions. I understand and will follow these steps and will also make sure to create a dedicated branch for future PR.

Copy link
Collaborator

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improve the git commit message. You should explain the rationale.
See https://cbea.ms/git-commit/

utils.h Outdated Show resolved Hide resolved
utils.h Outdated Show resolved Hide resolved
clint.c Outdated
Comment on lines 8 to 10
uint64_t time_delta =
clint->mtimecmp[hart->mhartid] - semu_timer_gettime(&hart->time);
if ((int32_t) time_delta <= 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using int32_t might trigger potential errors since this is a real-time counter. We can't predict when the clint_update_interrupts function will be executed. The time_delta could be a very small negative value, which would normally trigger the timer interrupt. However, if we cast it into a 32-bit integer, the value could incorrectly become positive. For example, if time_delta is 0xFFFFFFFF00000001, which is -4294967295, casting it to int32_t will result in the value becoming 1.

Therefore, I suggest using int64_t instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're absolutely right. When the larger number exceeds the small one by more than 0x100000000, it will incorrectly cast a negative number to a positive one. Thanks for pointing that out!

@chiangkd chiangkd force-pushed the master branch 2 times, most recently from be007bb to 82eb0b5 Compare July 22, 2024 18:08
utils.h Outdated Show resolved Hide resolved
utils.h Show resolved Hide resolved
Copy link
Collaborator

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read https://cbea.ms/git-commit/ carefully and refine your git commit messages.

Wrap the body at 72 characters

Avoid using backticks in git commit messages and comments to ensure compatibility with terminal emulators that may not render the backtick character properly.

Instead, use the pair ' or " (quotation marks).

utils.h Outdated Show resolved Hide resolved
@jserv
Copy link
Collaborator

jserv commented Jul 25, 2024

SEMU experiences a slowdown as the number of simulated harts increases. This is due to the increased number of timer interrupts and clock_gettime system calls in this commit.

Have you tried CLOCK_REALTIME as the clock source?

@chiangkd
Copy link
Collaborator Author

Have you tried CLOCK_REALTIME as the clock source?

Yes, CLOCK_REALTIME perform a similar result for booting system. CLOCK_REALTIME is affected by NTP. In the absence of time adjustments, CLOCK_REALTIME and CLOCK_MONOTIC might show similar time differences.

Execute with CLOCK_REALTIME

[   32.718809] rcu: INFO: rcu_sched self-detected stall on CPU
...
[   76.194389] Run /init as init process

Execute with CLOCK_MONOTIC

[   32.545280] rcu: INFO: rcu_sched self-detected stall on CPU
...
[   74.768311] Run /init as init process

@jserv
Copy link
Collaborator

jserv commented Jul 25, 2024

CLOCK_REALTIME perform a similar result for booting system. CLOCK_REALTIME is affected by NTP. In the absence of time adjustments, CLOCK_REALTIME and CLOCK_MONOTIC might show similar time differences.

How about CLOCK_MONOTONIC_RAW or CLOCK_MONOTONIC_COARSE?
They are implemented as faster but less precise alternatives of CLOCK_MONOTONIC.

@chiangkd
Copy link
Collaborator Author

chiangkd commented Jul 25, 2024

How about CLOCK_MONOTONIC_RAW or CLOCK_MONOTONIC_COARSE?
They are implemented as faster but less precise alternatives of CLOCK_MONOTONIC.

CLOCK_MONOTONIC_COARSE and CLOCK_REALTIME_COARSE can be faster because they use less precise and vDSO mechanism. However, they can still trigger RCU warnings.

Execute with CLOCK_MONOTONIC_COARSE

[   29.840317] rcu: INFO: rcu_sched self-detected stall on CPU
...
[   62.512664] Run /init as init process
...

Execute with CLOCK_REALTIME_COARSE

[   29.979120] rcu: INFO: rcu_sched self-detected stall on CPU
...
[   56.826371] Run /init as init process

Use perf to monitor performance for different clock sources during 100,000,000 calls to clock_gettime().

MONOTONIC REALTIME MONOTONIC_COARSE REALTIME_COARSE
user 1.723s 1.796s 0.556s 0.568s
sys 0s 0.008 0s 0s

@jserv
Copy link
Collaborator

jserv commented Jul 26, 2024

CLOCK_MONOTONIC_COARSE and CLOCK_REALTIME_COARSE can be faster because they use less precise and vDSO mechanism. However, they can still trigger RCU warnings.

Based on the experiments, empirical evidence suggests utilizing CLOCK_MONOTONIC_COARSE, which is available on modern Linux systems. Please submit a GitHub issue regarding RCU warnings once this pull request is merged.

Replace 'mtime' counter by retrieving time of the specified clock ID.
Additionally, a  Real-Time counter (RTC) mechanism is implemented to
prevent 'mtime' overflow.

By changing 'mtime' to use the retrieved timer, real timestamps are
visible in the booting log due to 'CONFIG_PRINTK_TIME' being enable
by default.

However, SEMU experiences a slowdown as the number of simulated harts
increases. This is due to the increased number of timer interrupts
and 'clock_gettime' system calls in this commit.

Another issue addressed is RCU CPU stall warning[1], which occurs
when the system waits more than a grace period (typically 21 seconds).

[1]: https://docs.kernel.org/RCU/stallwarn.html
@jserv jserv merged commit 207a4d7 into sysprog21:master Aug 1, 2024
2 checks passed
@jserv
Copy link
Collaborator

jserv commented Aug 1, 2024

Thank @chiangkd for contributing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants