Compare commits


24 Commits
alpha ... dev

Author SHA1 Message Date
chaphidoesstuff 8d6b694569 Update README.md 2024-09-30 12:30:59 +02:00
Crunch (Chaz9) 3aca4a3490 Updated 2024-09-29 21:31:09 +01:00
Crunch (Chaz9) 76f6f8de80 ok 2024-09-29 21:28:35 +01:00
Crunch (Chaz9) 592f93b26c Update Core Timing .h file 2024-09-29 21:23:11 +01:00
chad_mcguffin e5c47e911b Remove unrelated discord link
This doesn't need to be here, as it's all ex-developers who are now unrelated to the project
2024-09-20 20:44:32 +02:00
chaphidoesstuff 42ade6f62a need to fix bugs people! 2024-09-17 10:22:27 +02:00
Exverge 66993e2603 Comment out unimplemented check
In my testing on macOS, MK8 sometimes crashed at this function, returning a void type instead of u32.
I've temporarily commented this out until (and unless) this is implemented, and added a check for whether it is.
2024-09-15 21:37:12 +02:00
chaphidoesstuff 6be886d0ff audio_core: increment current revision, Courtesy of Sudachi Dev
Originally from 39effa1011/src/audio_core/common/feature_support.h# and my mirror
2024-09-15 17:50:09 +02:00
chaphidoesstuff ae65020815 Re-added credit to OG devs 2024-09-15 17:40:10 +02:00
Herman Semenov e886f27816 Using reserve() to optimize inserts, marked unused pair items and minor code refactor 2024-09-15 17:30:44 +02:00
chaphidoesstuff 9490b5264e Corrected Mistake 2024-09-15 17:18:09 +02:00
chaphidoesstuff 5f485a5863 Updated links 2024-09-15 16:41:53 +02:00
Crimson-Hawk 4eb41467f8 correct the false information in readme regarding rewrite 2024-07-04 12:22:04 +08:00
Crimson-Hawk daf2c1f496 fix android build 2024-05-29 17:43:46 +08:00
Crimson-Hawk 5f351bf2b3 remove temp.sh 2024-05-29 17:30:20 +08:00
Crimson-Hawk 7b13512b41 fixed reference to gitlab in ci 2024-05-29 17:23:06 +08:00
Crimson-Hawk e1f809079e fixed reference to gitlab in ci 2024-05-29 17:14:55 +08:00
Crimson-Hawk b95cfe6483 fixed reference to gitlab in ci 2024-05-29 16:51:35 +08:00
Crimson Hawk 433bcabb72 make pipeline run on every branch 2024-05-29 08:53:17 +08:00
administrator 267ba83d40 Remove unsanctioned Discord invite
Having a Discord server linked to Suyu poses a risk to the accounts of its members. Moreover, many of the members of this server have quit the Suyu project and do not wish to continue its development.
2024-05-28 21:07:57 +08:00
administrator 93b7854f95 Remove unsanctioned Discord invite
Having a Discord server linked to Suyu poses a risk to the accounts of its members. Moreover, many of the members of this server have quit the Suyu project and do not wish to continue its development.
2024-05-21 02:12:22 +02:00
chaphidoesstuff 2bacc25996 Update README.md 2024-05-19 16:11:58 -04:00
chaphidoesstuff 99ead71239 Updated README, fixed links in CONTRIBUTING.md
Co-authored-by: Exverge <exverge@exverge.xyz>
Committed-by: Exverge <exverge@exverge.xyz>
2024-05-19 16:11:58 -04:00
portaldevice e090ec8b21 Add migration instructions for migrating from yuzu (#178)
Co-authored-by: Exverge <exverge@exverge.xyz>
Signed-off-by: Exverge <exverge@exverge.xyz>
2024-05-19 16:03:52 -04:00
46 changed files with 848 additions and 1489 deletions

View File

@ -7,6 +7,8 @@
export NDK_CCACHE="$(which ccache)" export NDK_CCACHE="$(which ccache)"
ccache -s ccache -s
git submodule update --init --recursive
BUILD_FLAVOR="mainline" BUILD_FLAVOR="mainline"
BUILD_TYPE="release" BUILD_TYPE="release"

View File

@ -7,7 +7,9 @@
# Exit on error, rather than continuing with the rest of the script. # Exit on error, rather than continuing with the rest of the script.
set -e set -e
ccache -sv ccache -s
git submodule update --init --recursive
mkdir build || true && cd build mkdir build || true && cd build
cmake .. \ cmake .. \

View File

@ -6,7 +6,9 @@
# Exit on error, rather than continuing with the rest of the script. # Exit on error, rather than continuing with the rest of the script.
set -e set -e
ccache -sv ccache -s
git submodule update --init --recursive
mkdir build || true && cd build mkdir build || true && cd build
cmake .. \ cmake .. \
@ -52,9 +54,9 @@ DESTDIR="$PWD/AppDir" ninja install
rm -vf AppDir/usr/bin/suyu-cmd AppDir/usr/bin/suyu-tester rm -vf AppDir/usr/bin/suyu-cmd AppDir/usr/bin/suyu-tester
# Download tools needed to build an AppImage # Download tools needed to build an AppImage
wget -nc https://gitlab.com/suyu-emu/ext-linux-bin/-/raw/main/appimage/deploy-linux.sh wget -nc https://git.suyu.dev/suyu/ext-linux-bin/raw/branch/main/appimage/deploy-linux.sh
wget -nc https://gitlab.com/suyu-emu/ext-linux-bin/-/raw/main/appimage/exec-x86_64.so wget -nc https://git.suyu.dev/suyu/ext-linux-bin/raw/branch/main/appimage/exec-x86_64.so
wget -nc https://gitlab.com/suyu-emu/AppImageKit-checkrt/-/raw/old/AppRun.sh wget -nc https://git.suyu.dev/suyu/AppImageKit-checkrt/raw/branch/gh-workflow/AppRun
# Set executable bit # Set executable bit
chmod 755 \ chmod 755 \

View File

@ -8,7 +8,9 @@ set -e
#cd /suyu #cd /suyu
ccache -sv ccache -s
git submodule update --init --recursive
rm -rf build rm -rf build
mkdir -p build && cd build mkdir -p build && cd build

View File

@ -8,7 +8,7 @@ name: 'suyu verify'
on: on:
pull_request: pull_request:
branches: [ "dev" ] # branches: [ "dev" ]
paths: paths:
- 'src/**' - 'src/**'
- 'CMakeModules/**' - 'CMakeModules/**'
@ -19,7 +19,7 @@ on:
# paths-ignore: # paths-ignore:
# - 'src/android/**' # - 'src/android/**'
push: push:
branches: [ "dev" ] # branches: [ "dev" ]
paths: paths:
- 'src/**' - 'src/**'
- 'CMakeModules/**' - 'CMakeModules/**'

View File

@ -6,5 +6,5 @@ SPDX-License-Identifier: GPL-2.0-or-later
Please check out the Please check out the
* [Contributor's guide](https://gitlab.com/suyu-emu/suyu/-/wikis/Contributing). * [Contributor's guide](https://git.suyu.dev/suyu/suyu/wiki/Contributing).
* [Merge request guidelines](https://gitlab.com/suyu-emu/suyu/-/wikis/Merge-requests) * [Merge request guidelines](https://git.suyu.dev/suyu/suyu/wiki/Typical-Git-Workflow#once-your-pull-request-is-ready-to-be-merged)

25
MIGRATION.md Normal file
View File

@ -0,0 +1,25 @@
<!--
SPDX-FileCopyrightText: 2024 suyu Emulator Project
SPDX-License-Identifier: GPL-3.0-or-later
-->
# Migrating from yuzu
When coming from yuzu, the migration is as easy as renaming some directories.
## Windows
Use the Run dialog to go to `%APPDATA%` or manually go to `C:\Users\{USERNAME}\AppData\Roaming` (you may have to enable hidden files), and simply rename the `yuzu` directories to `suyu`.
## Unix (macOS/Linux)
Similarly, you can simply rename the folders `~/.local/share/yuzu` and `~/.config/yuzu` to `suyu`, either via a file manager or with the following commands:
```sh
$ mv ~/.local/share/yuzu ~/.local/share/suyu
$ mv ~/.config/yuzu ~/.config/suyu
```
There is also `~/.cache/yuzu`, which you can safely delete. Suyu will build a fresh cache in its own directory.
### Linux
Depending on your setup, you may want to substitute those base paths for `$XDG_DATA_HOME` and `$XDG_CONFIG_HOME` respectively.
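For example, the same migration can be scripted against those variables. A minimal C++17 sketch (the paths, XDG defaults, and a set `HOME` are assumptions, not part of the official instructions):
```cpp
#include <cstdlib>
#include <filesystem>

namespace fs = std::filesystem;

// Resolve an XDG base directory, falling back to the spec default when the
// environment variable is unset. (Assumes HOME is set, as on typical setups.)
static fs::path BaseDir(const char* var, const char* fallback) {
    if (const char* value = std::getenv(var)) {
        return fs::path{value};
    }
    return fs::path{std::getenv("HOME")} / fallback;
}

int main() {
    const auto data = BaseDir("XDG_DATA_HOME", ".local/share");
    const auto config = BaseDir("XDG_CONFIG_HOME", ".config");
    const auto cache = BaseDir("XDG_CACHE_HOME", ".cache");
    fs::rename(data / "yuzu", data / "suyu");     // game data and saves
    fs::rename(config / "yuzu", config / "suyu"); // configuration
    fs::remove_all(cache / "yuzu"); // safe to delete; suyu rebuilds its cache
}
```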
## Android
TBD

View File

@ -6,10 +6,10 @@ SPDX-License-Identifier: GPL-3.0-or-later
**Note**: We do not support or condone piracy in any form. In order to use suyu, you'll need keys from your real Switch system, and games which you have legally obtained and paid for. We do not intend to make money or profit from this project. **Note**: We do not support or condone piracy in any form. In order to use suyu, you'll need keys from your real Switch system, and games which you have legally obtained and paid for. We do not intend to make money or profit from this project.
We're in need of developers. Please join our chat below if you want to contribute! We're in need of developers. Please join our chat below or DM a dev if you want to contribute!
This repo was based on Yuzu EA 4176 but the code is being rewritten from the ground up for legal and performance reasons. This repo is currently based on Yuzu EA 4176 but the code will be rewritten for legal and performance reasons.
Support the original suyu developer team [here](https://discord.gg/ajz5hdrZ) Our only website is suyu.dev so please be cautious when using other sites offering builds/downloads.
<hr /> <hr />
@ -23,12 +23,13 @@ Support the original suyu developer team [here](https://discord.gg/ajz5hdrZ)
<h4 align="center"><b>suyu</b> was the continuation of the world's most popular, open-source Nintendo Switch emulator, yuzu, but is now something more. <h4 align="center"><b>suyu</b> was the continuation of the world's most popular, open-source Nintendo Switch emulator, yuzu, but is now something more.
<br> <br>
It is written in C++ (C# possibly required soon) with portability in mind, we actively work on builds for Windows, Linux, Android and hopefully IOS, along with a WIP custom OS called suyuOS (https://git.suyu.dev/suyu/suyu-os). It is written in C++ with portability in mind, and we actively provide builds for Windows, Linux and Android; iOS may come later.
</h4> </h4>
<p align="center"> <p align="center">
<a href="https://chat.suyu.dev">Chat</a> | <a href="https://chat.suyu.dev">Chat</a> |
<a href="https://www.reddit.com/r/suyu/">Reddit</a> |
<a href="#status">Status</a> | <a href="#status">Status</a> |
<a href="#development">Development</a> | <a href="#development">Development</a> |
<a href="#downloads">Downloads</a> | <a href="#downloads">Downloads</a> |
@ -41,6 +42,10 @@ It is written in C++ (C# possibly required soon) with portability in mind, we ac
## Hardware Requirements ## Hardware Requirements
[Click here to see the Hardware Requirements](https://git.suyu.dev/suyu/suyu/wiki/Hardware-Requirements) [Click here to see the Hardware Requirements](https://git.suyu.dev/suyu/suyu/wiki/Hardware-Requirements)
## Migrating from yuzu
See [MIGRATION.md](MIGRATION.md).
## Status ## Status
We currently have builds over at the [Releases](https://git.suyu.dev/suyu/suyu/releases) page. We currently have builds over at the [Releases](https://git.suyu.dev/suyu/suyu/releases) page.
@ -51,10 +56,10 @@ We currently have builds over at the [Releases](https://git.suyu.dev/suyu/suyu/r
This project is completely free and open source, and anyone can contribute to help improve suyu. This project is completely free and open source, and anyone can contribute to help improve suyu.
Most of the development happens on GitLab. For development discussion, please join us in our [Chat](https://chat.suyu.dev). Most of the development happens on Git. For development discussion, please join us in our [Chat](https://chat.suyu.dev) or [Subreddit](https://www.reddit.com/r/suyu/); you can also contact a developer.
If you want to contribute, please take a look at the [Contributor's Guide](https://git.suyu.dev/suyu/suyu/wiki/Contributing) and [Developer Information](https://git.suyu.dev/suyu/suyu/wiki/Developer-Information). If you want to contribute, please take a look at the [Contributor's Guide](https://git.suyu.dev/suyu/suyu/wiki/Contributing) and [Developer Information](https://git.suyu.dev/suyu/suyu/wiki/Developer-Information).
You can also contact any of the developers on Discord to learn more about the current state of suyu. You can also contact any of the developers on the Chat to learn more about the current state of suyu.
## Downloads ## Downloads
@ -62,26 +67,27 @@ You can also contact any of the developers on Discord to learn more about the cu
* __Linux__: [Releases](https://git.suyu.dev/suyu/suyu/releases) * __Linux__: [Releases](https://git.suyu.dev/suyu/suyu/releases)
* __macOS__: [Releases](https://git.suyu.dev/suyu/suyu/releases) * __macOS__: [Releases](https://git.suyu.dev/suyu/suyu/releases)
* __Android__: [Releases](https://git.suyu.dev/suyu/suyu/releases) * __Android__: [Releases](https://git.suyu.dev/suyu/suyu/releases)
* __For IOS users, we recommend Sudachi__: [Releases](https://github.com/emuPlace/Sudachi/releases) ###### We currently do not provide builds for iOS; however, if you would like, you could try the experimental Sudachi Emulator and its bigger project: [Folium](https://apps.apple.com/us/app/folium/id6498623389).
If you want daily builds then [Click here](https://git.suyu.dev/suyu/suyu/actions) If you want daily builds then [Click here](https://git.suyu.dev/suyu/suyu/actions).
If you don't know how to download the daily builds then [Click here](https://git.suyu.dev/suyu/suyu/raw/branch/dev/img/daily-builds.png) If you don't know how to download the daily builds then [Click here](https://git.suyu.dev/suyu/suyu/raw/branch/dev/img/daily-builds.png)
Right now we only have daily builds for Linux and Android.
We have official builds [here.](https://git.suyu.dev/suyu/suyu/releases) If any website or person is claiming to have a build for suyu, take that with a grain of salt. We have official builds [here.](https://git.suyu.dev/suyu/suyu/releases)<br>If any website or person is claiming to have a build for suyu, take that with a grain of salt and let us know.
For Multiplayer, we recommend using the "Yuzu Online" patch; install instructions can be found on Reddit and their Discord.
## Building ## Building
* __Windows__: [Windows Build](https://git.suyu.dev/suyu/suyu/wiki/Building-For-Windows) * __Windows__: [Windows Build](https://git.suyu.dev/suyu/suyu/wiki/Building-For-Windows)
* __Linux__: [Linux Build](https://git.suyu.dev/suyu/suyu/wiki/Building-For-Linux) * __Linux__: [Linux Build](https://git.suyu.dev/suyu/suyu/wiki/Building-For-Linux)
* __Android__: [Android Build](https://git.suyu.dev/suyu/suyu/wiki/Building-For-Android) * __Android__: [Android Build](https://git.suyu.dev/suyu/suyu/wiki/Building-For-Android)
* __macOS__: [macOS Build](https://git.suyu.dev/suyu/suyu/wiki/Building-for-macOS) * __MacOS__: [MacOS Build](https://git.suyu.dev/suyu/suyu/wiki/Building-for-macOS)
## Support ## Support
If you have any questions, don't hesitate to ask us in our [chat](https://chat.suyu.dev). We don't bite! If you have any questions, don't hesitate to ask us in our [Chat](https://chat.suyu.dev) or Subreddit, make an issue or contact a developer. We don't bite!
## License ## License

BIN
img/need to fix bugs.png Normal file

Binary file not shown (new image, 249 KiB).

View File

@ -13,7 +13,7 @@
#include "common/polyfill_ranges.h" #include "common/polyfill_ranges.h"
namespace AudioCore { namespace AudioCore {
constexpr u32 CurrentRevision = 11; constexpr u32 CurrentRevision = 12;
enum class SupportTags { enum class SupportTags {
CommandProcessingTimeEstimatorVersion4, CommandProcessingTimeEstimatorVersion4,

View File

@ -54,7 +54,8 @@ public:
const s32 to_register{std::min(std::min(appended_count, BufferAppendLimit), const s32 to_register{std::min(std::min(appended_count, BufferAppendLimit),
BufferAppendLimit - registered_count)}; BufferAppendLimit - registered_count)};
for (s32 i = 0; i < to_register; i++) { out_buffers.reserve(to_register);
for (s32 i = 0; i < to_register; ++i) {
s32 index{appended_index - appended_count}; s32 index{appended_index - appended_count};
if (index < 0) { if (index < 0) {
index += N; index += N;
@ -180,6 +181,7 @@ public:
return 0; return 0;
} }
buffers_flushed.reserve(registered_count + appended_count);
while (registered_count > 0) { while (registered_count > 0) {
auto index{registered_index - registered_count}; auto index{registered_index - registered_count};
if (index < 0) { if (index < 0) {
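The hunks above follow one pattern: call reserve() with a known upper bound before a push_back loop. A minimal sketch of why that helps (illustrative names only, not the emulator's code):
```cpp
#include <vector>

// Without reserve(), the vector grows geometrically and may reallocate
// (moving all existing elements) several times during the loop; with
// reserve(), at most one allocation happens up front.
std::vector<int> MakeSquares(int count) {
    std::vector<int> out;
    out.reserve(count); // capacity known in advance
    for (int i = 0; i < count; ++i) {
        out.push_back(i * i); // no reallocation inside the loop
    }
    return out;
}
```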

View File

@ -80,6 +80,7 @@ FileSys::VirtualFile GetGameFileFromPath(const FileSys::VirtualFilesystem& vfs,
if (filename == "00") { if (filename == "00") {
const auto dir = vfs->OpenDirectory(dir_name, FileSys::OpenMode::Read); const auto dir = vfs->OpenDirectory(dir_name, FileSys::OpenMode::Read);
std::vector<FileSys::VirtualFile> concat; std::vector<FileSys::VirtualFile> concat;
concat.reserve(0x10);
for (u32 i = 0; i < 0x10; ++i) { for (u32 i = 0; i < 0x10; ++i) {
const auto file_name = fmt::format("{:02X}", i); const auto file_name = fmt::format("{:02X}", i);

View File

@ -26,24 +26,6 @@ std::shared_ptr<EventType> CreateEvent(std::string name, TimedCallback&& callbac
return std::make_shared<EventType>(std::move(callback), std::move(name)); return std::make_shared<EventType>(std::move(callback), std::move(name));
} }
struct CoreTiming::Event {
s64 time;
u64 fifo_order;
std::weak_ptr<EventType> type;
s64 reschedule_time;
heap_t::handle_type handle{};
// Sort by time, unless the times are the same, in which case sort by
// the order added to the queue
friend bool operator>(const Event& left, const Event& right) {
return std::tie(left.time, left.fifo_order) > std::tie(right.time, right.fifo_order);
}
friend bool operator<(const Event& left, const Event& right) {
return std::tie(left.time, left.fifo_order) < std::tie(right.time, right.fifo_order);
}
};
CoreTiming::CoreTiming() : clock{Common::CreateOptimalClock()} {} CoreTiming::CoreTiming() : clock{Common::CreateOptimalClock()} {}
CoreTiming::~CoreTiming() { CoreTiming::~CoreTiming() {
@ -87,7 +69,7 @@ void CoreTiming::Pause(bool is_paused) {
} }
void CoreTiming::SyncPause(bool is_paused) { void CoreTiming::SyncPause(bool is_paused) {
if (is_paused == paused && paused_set == paused) { if (is_paused == paused && paused_set == is_paused) {
return; return;
} }
@ -112,7 +94,7 @@ bool CoreTiming::IsRunning() const {
bool CoreTiming::HasPendingEvents() const { bool CoreTiming::HasPendingEvents() const {
std::scoped_lock lock{basic_lock}; std::scoped_lock lock{basic_lock};
return !(wait_set && event_queue.empty()); return !event_queue.empty();
} }
void CoreTiming::ScheduleEvent(std::chrono::nanoseconds ns_into_future, void CoreTiming::ScheduleEvent(std::chrono::nanoseconds ns_into_future,
@ -121,8 +103,8 @@ void CoreTiming::ScheduleEvent(std::chrono::nanoseconds ns_into_future,
std::scoped_lock scope{basic_lock}; std::scoped_lock scope{basic_lock};
const auto next_time{absolute_time ? ns_into_future : GetGlobalTimeNs() + ns_into_future}; const auto next_time{absolute_time ? ns_into_future : GetGlobalTimeNs() + ns_into_future};
auto h{event_queue.emplace(Event{next_time.count(), event_fifo_id++, event_type, 0})}; event_queue.emplace_back(Event{next_time.count(), event_fifo_id++, event_type});
(*h).handle = h; std::push_heap(event_queue.begin(), event_queue.end(), std::greater<>());
} }
event.Set(); event.Set();
@ -136,9 +118,9 @@ void CoreTiming::ScheduleLoopingEvent(std::chrono::nanoseconds start_time,
std::scoped_lock scope{basic_lock}; std::scoped_lock scope{basic_lock};
const auto next_time{absolute_time ? start_time : GetGlobalTimeNs() + start_time}; const auto next_time{absolute_time ? start_time : GetGlobalTimeNs() + start_time};
auto h{event_queue.emplace( event_queue.emplace_back(
Event{next_time.count(), event_fifo_id++, event_type, resched_time.count()})}; Event{next_time.count(), event_fifo_id++, event_type, resched_time.count()});
(*h).handle = h; std::push_heap(event_queue.begin(), event_queue.end(), std::greater<>());
} }
event.Set(); event.Set();
@ -149,17 +131,11 @@ void CoreTiming::UnscheduleEvent(const std::shared_ptr<EventType>& event_type,
{ {
std::scoped_lock lk{basic_lock}; std::scoped_lock lk{basic_lock};
std::vector<heap_t::handle_type> to_remove; event_queue.erase(
for (auto itr = event_queue.begin(); itr != event_queue.end(); itr++) { std::remove_if(event_queue.begin(), event_queue.end(),
const Event& e = *itr; [&](const Event& e) { return e.type.lock().get() == event_type.get(); }),
if (e.type.lock().get() == event_type.get()) { event_queue.end());
to_remove.push_back(itr->handle); std::make_heap(event_queue.begin(), event_queue.end(), std::greater<>());
}
}
for (auto& h : to_remove) {
event_queue.erase(h);
}
event_type->sequence_number++; event_type->sequence_number++;
} }
@ -172,7 +148,7 @@ void CoreTiming::UnscheduleEvent(const std::shared_ptr<EventType>& event_type,
void CoreTiming::AddTicks(u64 ticks_to_add) { void CoreTiming::AddTicks(u64 ticks_to_add) {
cpu_ticks += ticks_to_add; cpu_ticks += ticks_to_add;
downcount -= static_cast<s64>(cpu_ticks); downcount -= static_cast<s64>(ticks_to_add);
} }
void CoreTiming::Idle() { void CoreTiming::Idle() {
@ -180,7 +156,7 @@ void CoreTiming::Idle() {
} }
void CoreTiming::ResetTicks() { void CoreTiming::ResetTicks() {
downcount = MAX_SLICE_LENGTH; downcount.store(MAX_SLICE_LENGTH, std::memory_order_release);
} }
u64 CoreTiming::GetClockTicks() const { u64 CoreTiming::GetClockTicks() const {
@ -201,48 +177,38 @@ std::optional<s64> CoreTiming::Advance() {
std::scoped_lock lock{advance_lock, basic_lock}; std::scoped_lock lock{advance_lock, basic_lock};
global_timer = GetGlobalTimeNs().count(); global_timer = GetGlobalTimeNs().count();
while (!event_queue.empty() && event_queue.top().time <= global_timer) { while (!event_queue.empty() && event_queue.front().time <= global_timer) {
const Event& evt = event_queue.top(); Event evt = std::move(event_queue.front());
std::pop_heap(event_queue.begin(), event_queue.end(), std::greater<>());
event_queue.pop_back();
if (const auto event_type{evt.type.lock()}) { if (const auto event_type = evt.type.lock()) {
const auto evt_time = evt.time; const auto evt_time = evt.time;
const auto evt_sequence_num = event_type->sequence_number; const auto evt_sequence_num = event_type->sequence_number;
if (evt.reschedule_time == 0) {
event_queue.pop();
basic_lock.unlock(); basic_lock.unlock();
event_type->callback( const auto new_schedule_time = event_type->callback(
evt_time, std::chrono::nanoseconds{GetGlobalTimeNs().count() - evt_time}); evt_time, std::chrono::nanoseconds{GetGlobalTimeNs().count() - evt_time});
basic_lock.lock(); basic_lock.lock();
} else {
basic_lock.unlock();
const auto new_schedule_time{event_type->callback(
evt_time, std::chrono::nanoseconds{GetGlobalTimeNs().count() - evt_time})};
basic_lock.lock();
if (evt_sequence_num != event_type->sequence_number) { if (evt_sequence_num != event_type->sequence_number) {
// Heap handle is invalidated after external modification.
continue; continue;
} }
const auto next_schedule_time{new_schedule_time.has_value() if (new_schedule_time.has_value() || evt.reschedule_time != 0) {
? new_schedule_time.value().count() const auto next_schedule_time = new_schedule_time.value_or(
: evt.reschedule_time}; std::chrono::nanoseconds{evt.reschedule_time});
// If this event was scheduled into a pause, its time now is going to be way auto next_time = evt.time + next_schedule_time.count();
// behind. Re-set this event to continue from the end of the pause.
auto next_time{evt.time + next_schedule_time};
if (evt.time < pause_end_time) { if (evt.time < pause_end_time) {
next_time = pause_end_time + next_schedule_time; next_time = pause_end_time + next_schedule_time.count();
} }
event_queue.update(evt.handle, Event{next_time, event_fifo_id++, evt.type, event_queue.emplace_back(Event{next_time, event_fifo_id++, evt.type,
next_schedule_time, evt.handle}); next_schedule_time.count()});
std::push_heap(event_queue.begin(), event_queue.end(), std::greater<>());
} }
} }
@ -250,7 +216,7 @@ std::optional<s64> CoreTiming::Advance() {
} }
if (!event_queue.empty()) { if (!event_queue.empty()) {
return event_queue.top().time; return event_queue.front().time;
} else { } else {
return std::nullopt; return std::nullopt;
} }
@ -269,7 +235,7 @@ void CoreTiming::ThreadLoop() {
#ifdef _WIN32 #ifdef _WIN32
while (!paused && !event.IsSet() && wait_time > 0) { while (!paused && !event.IsSet() && wait_time > 0) {
wait_time = *next_time - GetGlobalTimeNs().count(); wait_time = *next_time - GetGlobalTimeNs().count();
if (wait_time >= timer_resolution_ns) { if (wait_time >= 1'000'000) { // 1ms
Common::Windows::SleepForOneTick(); Common::Windows::SleepForOneTick();
} else { } else {
#ifdef ARCHITECTURE_x86_64 #ifdef ARCHITECTURE_x86_64
@ -290,10 +256,8 @@ void CoreTiming::ThreadLoop() {
} else { } else {
// Queue is empty, wait until another event is scheduled and signals us to // Queue is empty, wait until another event is scheduled and signals us to
// continue. // continue.
wait_set = true;
event.Wait(); event.Wait();
} }
wait_set = false;
} }
paused_set = true; paused_set = true;
@ -327,10 +291,4 @@ std::chrono::microseconds CoreTiming::GetGlobalTimeUs() const {
return std::chrono::microseconds{Common::WallClock::CPUTickToUS(cpu_ticks)}; return std::chrono::microseconds{Common::WallClock::CPUTickToUS(cpu_ticks)};
} }
#ifdef _WIN32
void CoreTiming::SetTimerResolutionNs(std::chrono::nanoseconds ns) {
timer_resolution_ns = ns.count();
}
#endif
} // namespace Core::Timing } // namespace Core::Timing
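The core_timing.cpp rewrite above replaces boost::heap::fibonacci_heap with a plain std::vector kept in heap order via the &lt;algorithm&gt; heap functions. A self-contained sketch of that queue discipline (names assumed, simplified from the emulator's types):
```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <tuple>
#include <vector>

// With std::greater<> as the comparator, the vector is a min-heap:
// front() is always the event with the smallest (time, fifo_order).
struct Event {
    int64_t time;
    uint64_t fifo_order; // FIFO tie-breaker for events with equal times
    bool operator>(const Event& other) const {
        return std::tie(time, fifo_order) > std::tie(other.time, other.fifo_order);
    }
};

void Push(std::vector<Event>& queue, Event e) {
    queue.push_back(e);
    std::push_heap(queue.begin(), queue.end(), std::greater<>());
}

Event PopSoonest(std::vector<Event>& queue) {
    std::pop_heap(queue.begin(), queue.end(), std::greater<>()); // min moves to back
    Event e = queue.back();
    queue.pop_back();
    return e;
}
```
UnscheduleEvent's remove_if followed by std::make_heap, as shown in the diff, restores the same invariant in linear time after removing arbitrary elements.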

View File

@ -11,8 +11,7 @@
#include <optional> #include <optional>
#include <string> #include <string>
#include <thread> #include <thread>
#include <vector>
#include <boost/heap/fibonacci_heap.hpp>
#include "common/common_types.h" #include "common/common_types.h"
#include "common/thread.h" #include "common/thread.h"
@ -43,18 +42,6 @@ enum class UnscheduleEventType {
NoWait, NoWait,
}; };
/**
* This is a system to schedule events into the emulated machine's future. Time is measured
* in main CPU clock cycles.
*
* To schedule an event, you first have to register its type. This is where you pass in the
* callback. You then schedule events using the type ID you get back.
*
* The s64 ns_late that the callbacks get is how many ns late it was.
* So to schedule a new event on a regular basis:
* inside callback:
* ScheduleEvent(period_in_ns - ns_late, callback, "whatever")
*/
class CoreTiming { class CoreTiming {
public: public:
CoreTiming(); CoreTiming();
@ -66,99 +53,56 @@ public:
CoreTiming& operator=(const CoreTiming&) = delete; CoreTiming& operator=(const CoreTiming&) = delete;
CoreTiming& operator=(CoreTiming&&) = delete; CoreTiming& operator=(CoreTiming&&) = delete;
/// CoreTiming begins at the boundary of timing slice -1. An initial call to Advance() is
/// required to end slice - 1 and start slice 0 before the first cycle of code is executed.
void Initialize(std::function<void()>&& on_thread_init_); void Initialize(std::function<void()>&& on_thread_init_);
/// Clear all pending events. This should ONLY be done on exit.
void ClearPendingEvents(); void ClearPendingEvents();
/// Sets if emulation is multicore or single core, must be set before Initialize
void SetMulticore(bool is_multicore_) { void SetMulticore(bool is_multicore_) {
is_multicore = is_multicore_; is_multicore = is_multicore_;
} }
/// Pauses/Unpauses the execution of the timer thread.
void Pause(bool is_paused); void Pause(bool is_paused);
/// Pauses/Unpauses the execution of the timer thread and waits until paused.
void SyncPause(bool is_paused); void SyncPause(bool is_paused);
/// Checks if core timing is running.
bool IsRunning() const; bool IsRunning() const;
/// Checks if the timer thread has started.
bool HasStarted() const { bool HasStarted() const {
return has_started; return has_started;
} }
/// Checks if there are any pending time events.
bool HasPendingEvents() const; bool HasPendingEvents() const;
/// Schedules an event in core timing
void ScheduleEvent(std::chrono::nanoseconds ns_into_future, void ScheduleEvent(std::chrono::nanoseconds ns_into_future,
const std::shared_ptr<EventType>& event_type, bool absolute_time = false); const std::shared_ptr<EventType>& event_type, bool absolute_time = false);
/// Schedules an event which will automatically re-schedule itself with the given time, until
/// unscheduled
void ScheduleLoopingEvent(std::chrono::nanoseconds start_time, void ScheduleLoopingEvent(std::chrono::nanoseconds start_time,
std::chrono::nanoseconds resched_time, std::chrono::nanoseconds resched_time,
const std::shared_ptr<EventType>& event_type, const std::shared_ptr<EventType>& event_type,
bool absolute_time = false); bool absolute_time = false);
void UnscheduleEvent(const std::shared_ptr<EventType>& event_type, void UnscheduleEvent(const std::shared_ptr<EventType>& event_type,
UnscheduleEventType type = UnscheduleEventType::Wait); UnscheduleEventType type = UnscheduleEventType::Wait);
void AddTicks(u64 ticks_to_add); void AddTicks(u64 ticks_to_add);
void ResetTicks(); void ResetTicks();
void Idle(); void Idle();
s64 GetDowncount() const { s64 GetDowncount() const {
return downcount; return downcount.load(std::memory_order_relaxed);
} }
/// Returns the current CNTPCT tick value.
u64 GetClockTicks() const; u64 GetClockTicks() const;
/// Returns the current GPU tick value.
u64 GetGPUTicks() const; u64 GetGPUTicks() const;
/// Returns current time in microseconds.
std::chrono::microseconds GetGlobalTimeUs() const; std::chrono::microseconds GetGlobalTimeUs() const;
/// Returns current time in nanoseconds.
std::chrono::nanoseconds GetGlobalTimeNs() const; std::chrono::nanoseconds GetGlobalTimeNs() const;
/// Checks for events manually and returns time in nanoseconds for next event, threadsafe.
std::optional<s64> Advance(); std::optional<s64> Advance();
#ifdef _WIN32
void SetTimerResolutionNs(std::chrono::nanoseconds ns);
#endif
private: private:
struct Event; struct Event {
s64 time;
u64 fifo_order;
std::shared_ptr<EventType> type;
bool operator>(const Event& other) const {
return std::tie(time, fifo_order) > std::tie(other.time, other.fifo_order);
}
};
static void ThreadEntry(CoreTiming& instance); static void ThreadEntry(CoreTiming& instance);
void ThreadLoop(); void ThreadLoop();
void Reset(); void Reset();
std::unique_ptr<Common::WallClock> clock; std::unique_ptr<Common::WallClock> clock;
std::atomic<s64> global_timer{0};
s64 global_timer = 0; std::vector<Event> event_queue;
std::atomic<u64> event_fifo_id{0};
#ifdef _WIN32
s64 timer_resolution_ns;
#endif
using heap_t =
boost::heap::fibonacci_heap<CoreTiming::Event, boost::heap::compare<std::greater<>>>;
heap_t event_queue;
u64 event_fifo_id = 0;
Common::Event event{}; Common::Event event{};
Common::Event pause_event{}; Common::Event pause_event{};
@ -173,20 +117,12 @@ private:
std::function<void()> on_thread_init{}; std::function<void()> on_thread_init{};
bool is_multicore{}; bool is_multicore{};
s64 pause_end_time{}; std::atomic<s64> pause_end_time{};
/// Cycle timing std::atomic<u64> cpu_ticks{};
u64 cpu_ticks{}; std::atomic<s64> downcount{};
s64 downcount{};
}; };
/// Creates a core timing event with the given name and callback.
///
/// @param name The name of the core timing event to create.
/// @param callback The callback to execute for the event.
///
/// @returns An EventType instance representing the created event.
///
std::shared_ptr<EventType> CreateEvent(std::string name, TimedCallback&& callback); std::shared_ptr<EventType> CreateEvent(std::string name, TimedCallback&& callback);
} // namespace Core::Timing } // namespace Core::Timing
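The header above makes global_timer, pause_end_time, cpu_ticks, and downcount atomic; GetDowncount() now uses a relaxed load, and ResetTicks() (in the .cpp diff earlier) a release store. A minimal sketch of that discipline, as an assumed simplification of the real class:
```cpp
#include <atomic>
#include <cstdint>

class Downcount {
public:
    // Called from the CPU thread; other threads only need an approximate
    // view of the remaining slice, so relaxed ordering is enough here.
    void AddTicks(uint64_t ticks) {
        value.fetch_sub(static_cast<int64_t>(ticks), std::memory_order_relaxed);
    }
    // Publishes the refreshed slice length to readers.
    void Reset(int64_t max_slice_length) {
        value.store(max_slice_length, std::memory_order_release);
    }
    int64_t Get() const {
        return value.load(std::memory_order_relaxed);
    }

private:
    std::atomic<int64_t> value{0};
};
```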

View File

@ -1,6 +1,12 @@
// SPDX-FileCopyrightText: Copyright 2018 yuzu Emulator Project // SPDX-FileCopyrightText: Copyright 2018 yuzu Emulator Project
// SPDX-License-Identifier: GPL-2.0-or-later // SPDX-License-Identifier: GPL-2.0-or-later
#include <algorithm>
#include <atomic>
#include <memory>
#include <thread>
#include <vector>
#include "common/fiber.h" #include "common/fiber.h"
#include "common/microprofile.h" #include "common/microprofile.h"
#include "common/scope_exit.h" #include "common/scope_exit.h"
@ -24,6 +30,7 @@ void CpuManager::Initialize() {
num_cores = is_multicore ? Core::Hardware::NUM_CPU_CORES : 1; num_cores = is_multicore ? Core::Hardware::NUM_CPU_CORES : 1;
gpu_barrier = std::make_unique<Common::Barrier>(num_cores + 1); gpu_barrier = std::make_unique<Common::Barrier>(num_cores + 1);
core_data.resize(num_cores);
for (std::size_t core = 0; core < num_cores; core++) { for (std::size_t core = 0; core < num_cores; core++) {
core_data[core].host_thread = core_data[core].host_thread =
std::jthread([this, core](std::stop_token token) { RunThread(token, core); }); std::jthread([this, core](std::stop_token token) { RunThread(token, core); });
@ -31,10 +38,10 @@ void CpuManager::Initialize() {
} }
void CpuManager::Shutdown() { void CpuManager::Shutdown() {
for (std::size_t core = 0; core < num_cores; core++) { for (auto& data : core_data) {
if (core_data[core].host_thread.joinable()) { if (data.host_thread.joinable()) {
core_data[core].host_thread.request_stop(); data.host_thread.request_stop();
core_data[core].host_thread.join(); data.host_thread.join();
} }
} }
} }
@ -66,12 +73,7 @@ void CpuManager::HandleInterrupt() {
Kernel::KInterruptManager::HandleInterrupt(kernel, static_cast<s32>(core_index)); Kernel::KInterruptManager::HandleInterrupt(kernel, static_cast<s32>(core_index));
} }
///////////////////////////////////////////////////////////////////////////////
/// MultiCore ///
///////////////////////////////////////////////////////////////////////////////
void CpuManager::MultiCoreRunGuestThread() { void CpuManager::MultiCoreRunGuestThread() {
// Similar to UserModeThreadStarter in HOS
auto& kernel = system.Kernel(); auto& kernel = system.Kernel();
auto* thread = Kernel::GetCurrentThreadPointer(kernel); auto* thread = Kernel::GetCurrentThreadPointer(kernel);
kernel.CurrentScheduler()->OnThreadStart(); kernel.CurrentScheduler()->OnThreadStart();
@ -88,10 +90,6 @@ void CpuManager::MultiCoreRunGuestThread() {
} }
void CpuManager::MultiCoreRunIdleThread() { void CpuManager::MultiCoreRunIdleThread() {
// Not accurate to HOS. Remove this entire method when singlecore is removed.
// See notes in KScheduler::ScheduleImpl for more information about why this
// is inaccurate.
auto& kernel = system.Kernel(); auto& kernel = system.Kernel();
kernel.CurrentScheduler()->OnThreadStart(); kernel.CurrentScheduler()->OnThreadStart();
@ -105,10 +103,6 @@ void CpuManager::MultiCoreRunIdleThread() {
} }
} }
///////////////////////////////////////////////////////////////////////////////
/// SingleCore ///
///////////////////////////////////////////////////////////////////////////////
void CpuManager::SingleCoreRunGuestThread() { void CpuManager::SingleCoreRunGuestThread() {
auto& kernel = system.Kernel(); auto& kernel = system.Kernel();
auto* thread = Kernel::GetCurrentThreadPointer(kernel); auto* thread = Kernel::GetCurrentThreadPointer(kernel);
@ -154,19 +148,16 @@ void CpuManager::PreemptSingleCore(bool from_running_environment) {
system.CoreTiming().Advance(); system.CoreTiming().Advance();
kernel.SetIsPhantomModeForSingleCore(false); kernel.SetIsPhantomModeForSingleCore(false);
} }
current_core.store((current_core + 1) % Core::Hardware::NUM_CPU_CORES); current_core.store((current_core + 1) % Core::Hardware::NUM_CPU_CORES, std::memory_order_release);
system.CoreTiming().ResetTicks(); system.CoreTiming().ResetTicks();
kernel.Scheduler(current_core).PreemptSingleCore(); kernel.Scheduler(current_core).PreemptSingleCore();
// We've now been scheduled again, and we may have exchanged schedulers.
// Reload the scheduler in case it's different.
if (!kernel.Scheduler(current_core).IsIdle()) { if (!kernel.Scheduler(current_core).IsIdle()) {
idle_count = 0; idle_count = 0;
} }
} }
void CpuManager::GuestActivate() { void CpuManager::GuestActivate() {
// Similar to the HorizonKernelMain callback in HOS
auto& kernel = system.Kernel(); auto& kernel = system.Kernel();
auto* scheduler = kernel.CurrentScheduler(); auto* scheduler = kernel.CurrentScheduler();
@ -184,27 +175,19 @@ void CpuManager::ShutdownThread() {
} }
void CpuManager::RunThread(std::stop_token token, std::size_t core) { void CpuManager::RunThread(std::stop_token token, std::size_t core) {
/// Initialization
system.RegisterCoreThread(core); system.RegisterCoreThread(core);
std::string name; std::string name = is_multicore ? "CPUCore_" + std::to_string(core) : "CPUThread";
if (is_multicore) {
name = "CPUCore_" + std::to_string(core);
} else {
name = "CPUThread";
}
MicroProfileOnThreadCreate(name.c_str()); MicroProfileOnThreadCreate(name.c_str());
Common::SetCurrentThreadName(name.c_str()); Common::SetCurrentThreadName(name.c_str());
Common::SetCurrentThreadPriority(Common::ThreadPriority::Critical); Common::SetCurrentThreadPriority(Common::ThreadPriority::Critical);
auto& data = core_data[core]; auto& data = core_data[core];
data.host_context = Common::Fiber::ThreadToFiber(); data.host_context = Common::Fiber::ThreadToFiber();
// Cleanup
SCOPE_EXIT { SCOPE_EXIT {
data.host_context->Exit(); data.host_context->Exit();
MicroProfileOnThreadExit(); MicroProfileOnThreadExit();
}; };
// Running
if (!gpu_barrier->Sync(token)) { if (!gpu_barrier->Sync(token)) {
return; return;
} }

View File

@ -481,6 +481,7 @@ void GDBStub::HandleQuery(std::string_view command) {
// beginning of list // beginning of list
const auto& threads = GetProcess()->GetThreadList(); const auto& threads = GetProcess()->GetThreadList();
std::vector<std::string> thread_ids; std::vector<std::string> thread_ids;
thread_ids.reserve(threads.size());
for (const auto& thread : threads) { for (const auto& thread : threads) {
thread_ids.push_back(fmt::format("{:x}", thread.GetThreadId())); thread_ids.push_back(fmt::format("{:x}", thread.GetThreadId()));
} }

View File

@ -261,7 +261,7 @@ std::vector<NcaID> PlaceholderCache::List() const {
std::vector<NcaID> out; std::vector<NcaID> out;
for (const auto& sdir : dir->GetSubdirectories()) { for (const auto& sdir : dir->GetSubdirectories()) {
for (const auto& file : sdir->GetFiles()) { for (const auto& file : sdir->GetFiles()) {
const auto name = file->GetName(); const auto& name = file->GetName();
if (name.length() == 36 && name.ends_with(".nca")) { if (name.length() == 36 && name.ends_with(".nca")) {
out.push_back(Common::HexStringToArray<0x10>(name.substr(0, 32))); out.push_back(Common::HexStringToArray<0x10>(name.substr(0, 32)));
} }

View File

@ -117,7 +117,9 @@ std::vector<std::shared_ptr<NCA>> NSP::GetNCAsCollapsed() const {
if (extracted) if (extracted)
LOG_WARNING(Service_FS, "called on an NSP that is of type extracted."); LOG_WARNING(Service_FS, "called on an NSP that is of type extracted.");
std::vector<std::shared_ptr<NCA>> out; std::vector<std::shared_ptr<NCA>> out;
out.reserve(ncas.size());
for (const auto& map : ncas) { for (const auto& map : ncas) {
out.reserve(map.second.size());
for (const auto& inner_map : map.second) for (const auto& inner_map : map.second)
out.push_back(inner_map.second); out.push_back(inner_map.second);
} }

View File

@ -24,7 +24,7 @@ constexpr std::array<u8, 30> WORD_TXT{
VirtualDir NgWord1() { VirtualDir NgWord1() {
std::vector<VirtualFile> files; std::vector<VirtualFile> files;
files.reserve(NgWord1Data::NUMBER_WORD_TXT_FILES); files.reserve(files.size() + 2);
for (std::size_t i = 0; i < files.size(); ++i) { for (std::size_t i = 0; i < files.size(); ++i) {
files.push_back(MakeArrayFile(NgWord1Data::WORD_TXT, fmt::format("{}.txt", i))); files.push_back(MakeArrayFile(NgWord1Data::WORD_TXT, fmt::format("{}.txt", i)));
@ -54,7 +54,7 @@ constexpr std::array<u8, 0x2C> AC_NX_DATA{
VirtualDir NgWord2() { VirtualDir NgWord2() {
std::vector<VirtualFile> files; std::vector<VirtualFile> files;
files.reserve(NgWord2Data::NUMBER_AC_NX_FILES * 3); files.reserve(NgWord2Data::NUMBER_AC_NX_FILES + 4);
for (std::size_t i = 0; i < NgWord2Data::NUMBER_AC_NX_FILES; ++i) { for (std::size_t i = 0; i < NgWord2Data::NUMBER_AC_NX_FILES; ++i) {
files.push_back(MakeArrayFile(NgWord2Data::AC_NX_DATA, fmt::format("ac_{}_b1_nx", i))); files.push_back(MakeArrayFile(NgWord2Data::AC_NX_DATA, fmt::format("ac_{}_b1_nx", i)));

View File

@ -37,6 +37,7 @@ const static std::map<std::string, const std::map<const char*, const std::vector
static void GenerateFiles(std::vector<VirtualFile>& directory, static void GenerateFiles(std::vector<VirtualFile>& directory,
const std::map<const char*, const std::vector<u8>>& files) { const std::map<const char*, const std::vector<u8>>& files) {
directory.reserve(files.size());
for (const auto& [filename, data] : files) { for (const auto& [filename, data] : files) {
const auto data_copy{data}; const auto data_copy{data};
const std::string filename_copy{filename}; const std::string filename_copy{filename};
@ -54,6 +55,7 @@ static std::vector<VirtualFile> GenerateZoneinfoFiles() {
VirtualDir TimeZoneBinary() { VirtualDir TimeZoneBinary() {
std::vector<VirtualDir> america_sub_dirs; std::vector<VirtualDir> america_sub_dirs;
america_sub_dirs.reserve(tzdb_america_dirs.size());
for (const auto& [dir_name, files] : tzdb_america_dirs) { for (const auto& [dir_name, files] : tzdb_america_dirs) {
std::vector<VirtualFile> vfs_files; std::vector<VirtualFile> vfs_files;
GenerateFiles(vfs_files, files); GenerateFiles(vfs_files, files);
@ -62,6 +64,7 @@ VirtualDir TimeZoneBinary() {
} }
std::vector<VirtualDir> zoneinfo_sub_dirs; std::vector<VirtualDir> zoneinfo_sub_dirs;
zoneinfo_sub_dirs.reserve(tzdb_zoneinfo_dirs.size());
for (const auto& [dir_name, files] : tzdb_zoneinfo_dirs) { for (const auto& [dir_name, files] : tzdb_zoneinfo_dirs) {
std::vector<VirtualFile> vfs_files; std::vector<VirtualFile> vfs_files;
GenerateFiles(vfs_files, files); GenerateFiles(vfs_files, files);

View File

@ -38,7 +38,8 @@ VirtualDir CachedVfsDirectory::GetSubdirectory(std::string_view dir_name) const
std::vector<VirtualFile> CachedVfsDirectory::GetFiles() const { std::vector<VirtualFile> CachedVfsDirectory::GetFiles() const {
std::vector<VirtualFile> out; std::vector<VirtualFile> out;
for (auto& [file_name, file] : files) { out.reserve(files.size());
for (const auto& [_, file] : files) {
out.push_back(file); out.push_back(file);
} }
return out; return out;
@ -46,7 +47,8 @@ std::vector<VirtualFile> CachedVfsDirectory::GetFiles() const {
std::vector<VirtualDir> CachedVfsDirectory::GetSubdirectories() const { std::vector<VirtualDir> CachedVfsDirectory::GetSubdirectories() const {
std::vector<VirtualDir> out; std::vector<VirtualDir> out;
for (auto& [dir_name, dir] : dirs) { out.reserve(dirs.size());
for (auto& [_, dir] : dirs) {
out.push_back(dir); out.push_back(dir);
} }
return out; return out;
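This and many of the following hunks make the same cosmetic change: structured-binding names that are never used become `_`. A small sketch of the convention (note that before C++26, `_` is an ordinary identifier, so each scope can bind it only once):
```cpp
#include <map>
#include <string>
#include <vector>

// Collect only the mapped values; the key binding is deliberately named `_`
// to signal that it is unused.
std::vector<std::string> CollectValues(const std::map<int, std::string>& table) {
    std::vector<std::string> out;
    out.reserve(table.size()); // same reserve-before-push_back pattern as above
    for (const auto& [_, value] : table) {
        out.push_back(value);
    }
    return out;
}
```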

View File

@ -121,7 +121,7 @@ void WindowSystem::RequestAppletVisibilityState(Applet& applet, bool visible) {
void WindowSystem::OnOperationModeChanged() { void WindowSystem::OnOperationModeChanged() {
std::scoped_lock lk{m_lock}; std::scoped_lock lk{m_lock};
for (const auto& [aruid, applet] : m_applets) { for (const auto& [_, applet] : m_applets) {
std::scoped_lock lk2{applet->lock}; std::scoped_lock lk2{applet->lock};
applet->lifecycle_manager.OnOperationAndPerformanceModeChanged(); applet->lifecycle_manager.OnOperationAndPerformanceModeChanged();
} }
@ -130,7 +130,7 @@ void WindowSystem::OnOperationModeChanged() {
void WindowSystem::OnExitRequested() { void WindowSystem::OnExitRequested() {
std::scoped_lock lk{m_lock}; std::scoped_lock lk{m_lock};
for (const auto& [aruid, applet] : m_applets) { for (const auto& [_, applet] : m_applets) {
std::scoped_lock lk2{applet->lock}; std::scoped_lock lk2{applet->lock};
applet->lifecycle_manager.RequestExit(); applet->lifecycle_manager.RequestExit();
} }
@ -156,7 +156,7 @@ void WindowSystem::OnHomeButtonPressed(ButtonPressDuration type) {
void WindowSystem::PruneTerminatedAppletsLocked() { void WindowSystem::PruneTerminatedAppletsLocked() {
for (auto it = m_applets.begin(); it != m_applets.end(); /* ... */) { for (auto it = m_applets.begin(); it != m_applets.end(); /* ... */) {
const auto& [aruid, applet] = *it; const auto& [_, applet] = *it;
std::scoped_lock lk{applet->lock}; std::scoped_lock lk{applet->lock};

View File

@ -119,7 +119,7 @@ Result LANDiscovery::Scan(std::span<NetworkInfo> out_networks, s16& out_count,
std::this_thread::sleep_for(std::chrono::seconds(1)); std::this_thread::sleep_for(std::chrono::seconds(1));
std::scoped_lock lock{packet_mutex}; std::scoped_lock lock{packet_mutex};
for (const auto& [key, info] : scan_results) { for (const auto& [_, info] : scan_results) {
if (out_count >= static_cast<s16>(out_networks.size())) { if (out_count >= static_cast<s16>(out_networks.size())) {
break; break;
} }

View File

@ -348,7 +348,7 @@ Result IApplicationManagerInterface::ListApplicationRecord(
size_t i = 0; size_t i = 0;
u8 ii = 24; u8 ii = 24;
for (const auto& [slot, game] : installed_games) { for (const auto& [_, game] : installed_games) {
if (i >= limit) { if (i >= limit) {
break; break;
} }

View File

@ -28,7 +28,7 @@ ServiceManager::ServiceManager(Kernel::KernelCore& kernel_) : kernel{kernel_} {
} }
ServiceManager::~ServiceManager() { ServiceManager::~ServiceManager() {
for (auto& [name, port] : service_ports) { for (auto& [_, port] : service_ports) {
port->Close(); port->Close();
} }

File diff suppressed because it is too large.

View File

@ -571,7 +571,7 @@ SDLDriver::~SDLDriver() {
std::vector<Common::ParamPackage> SDLDriver::GetInputDevices() const { std::vector<Common::ParamPackage> SDLDriver::GetInputDevices() const {
std::vector<Common::ParamPackage> devices; std::vector<Common::ParamPackage> devices;
std::unordered_map<int, std::shared_ptr<SDLJoystick>> joycon_pairs; std::unordered_map<int, std::shared_ptr<SDLJoystick>> joycon_pairs;
for (const auto& [key, value] : joystick_map) { for (const auto& [_, value] : joystick_map) {
for (const auto& joystick : value) { for (const auto& joystick : value) {
if (!joystick->GetSDLJoystick()) { if (!joystick->GetSDLJoystick()) {
continue; continue;
@ -591,7 +591,7 @@ std::vector<Common::ParamPackage> SDLDriver::GetInputDevices() const {
} }
// Add dual controllers // Add dual controllers
for (const auto& [key, value] : joystick_map) { for (const auto& [_, value] : joystick_map) {
for (const auto& joystick : value) { for (const auto& joystick : value) {
if (joystick->IsJoyconRight()) { if (joystick->IsJoyconRight()) {
if (!joycon_pairs.contains(joystick->GetPort())) { if (!joycon_pairs.contains(joystick->GetPort())) {

View File

@ -196,8 +196,11 @@ Id Texture(EmitContext& ctx, IR::TextureInstInfo info, [[maybe_unused]] const IR
} }
Id TextureImage(EmitContext& ctx, IR::TextureInstInfo info, const IR::Value& index) { Id TextureImage(EmitContext& ctx, IR::TextureInstInfo info, const IR::Value& index) {
if (!index.IsImmediate() || index.U32() != 0) { // if (!index.IsImmediate() || index.Type() != Shader::IR::Type::U32 || index.U32() != 0) {
throw NotImplementedException("Indirect image indexing"); // throw NotImplementedException("Indirect image indexing");
// }
if (index.Type() != Shader::IR::Type::U32) {
LOG_WARNING(Shader_SPIRV, "Non-U32 type provided as index: {}", index.Type());
} }
if (info.type == TextureType::Buffer) { if (info.type == TextureType::Buffer) {
const TextureBufferDefinition& def{ctx.texture_buffers.at(info.descriptor_index)}; const TextureBufferDefinition& def{ctx.texture_buffers.at(info.descriptor_index)};
@ -215,8 +218,11 @@ Id TextureImage(EmitContext& ctx, IR::TextureInstInfo info, const IR::Value& ind
} }
std::pair<Id, bool> Image(EmitContext& ctx, const IR::Value& index, IR::TextureInstInfo info) { std::pair<Id, bool> Image(EmitContext& ctx, const IR::Value& index, IR::TextureInstInfo info) {
if (!index.IsImmediate() || index.U32() != 0) { // if (!index.IsImmediate() || index.Type() != Shader::IR::Type::U32 || index.U32() != 0) {
throw NotImplementedException("Indirect image indexing"); // throw NotImplementedException("Indirect image indexing");
// }
if (index.Type() != Shader::IR::Type::U32) {
LOG_WARNING(Shader_SPIRV, "Non-U32 type provided as index: {}", index.Type());
} }
if (info.type == TextureType::Buffer) { if (info.type == TextureType::Buffer) {
const ImageBufferDefinition def{ctx.image_buffers.at(info.descriptor_index)}; const ImageBufferDefinition def{ctx.image_buffers.at(info.descriptor_index)};

View File

@ -69,7 +69,7 @@ void ConfigureApplets::Setup(const ConfigurationShared::Builder& builder) {
applets_hold.emplace(setting->Id(), widget); applets_hold.emplace(setting->Id(), widget);
} }
for (const auto& [label, widget] : applets_hold) { for (const auto& [_, widget] : applets_hold) {
library_applets_layout.addWidget(widget); library_applets_layout.addWidget(widget);
} }
} }

View File

@ -164,7 +164,7 @@ void ConfigureAudio::Setup(const ConfigurationShared::Builder& builder) {
} }
} }
for (const auto& [id, widget] : hold) { for (const auto& [_, widget] : hold) {
layout.addWidget(widget); layout.addWidget(widget);
} }
} }

View File

@ -79,7 +79,7 @@ void ConfigureCpu::Setup(const ConfigurationShared::Builder& builder) {
} }
} }
for (const auto& [label, widget] : unsafe_hold) { for (const auto& [_, widget] : unsafe_hold) {
unsafe_layout->addWidget(widget); unsafe_layout->addWidget(widget);
} }

View File

@ -81,10 +81,10 @@ void ConfigureGeneral::Setup(const ConfigurationShared::Builder& builder) {
} }
} }
for (const auto& [id, widget] : general_hold) { for (const auto& [_, widget] : general_hold) {
general_layout.addWidget(widget); general_layout.addWidget(widget);
} }
for (const auto& [id, widget] : linux_hold) { for (const auto& [_, widget] : linux_hold) {
linux_layout.addWidget(widget); linux_layout.addWidget(widget);
} }
} }

View File

@ -358,7 +358,7 @@ void ConfigureGraphics::Setup(const ConfigurationShared::Builder& builder) {
} }
} }
for (const auto& [id, widget] : hold_graphics) { for (const auto& [_, widget] : hold_graphics) {
graphics_layout.addWidget(widget); graphics_layout.addWidget(widget);
} }

View File

@ -53,7 +53,7 @@ void ConfigureGraphicsAdvanced::Setup(const ConfigurationShared::Builder& builde
checkbox_enable_compute_pipelines = widget; checkbox_enable_compute_pipelines = widget;
} }
} }
for (const auto& [id, widget] : hold) { for (const auto& [_, widget] : hold) {
layout.addWidget(widget); layout.addWidget(widget);
} }
} }

View File

@ -50,7 +50,7 @@ void ConfigureLinuxTab::Setup(const ConfigurationShared::Builder& builder) {
linux_hold.insert({setting->Id(), widget}); linux_hold.insert({setting->Id(), widget});
} }
for (const auto& [id, widget] : linux_hold) { for (const auto& [_, widget] : linux_hold) {
linux_layout.addWidget(widget); linux_layout.addWidget(widget);
} }
} }

View File

@ -174,10 +174,10 @@ void ConfigureSystem::Setup(const ConfigurationShared::Builder& builder) {
widget->deleteLater(); widget->deleteLater();
} }
} }
for (const auto& [label, widget] : core_hold) { for (const auto& [_, widget] : core_hold) {
core_layout.addWidget(widget); core_layout.addWidget(widget);
} }
for (const auto& [id, widget] : system_hold) { for (const auto& [_, widget] : system_hold) {
system_layout.addWidget(widget); system_layout.addWidget(widget);
} }
} }

View File

@ -83,7 +83,7 @@ static void PopulateResolutionComboBox(QComboBox* screenshot_height, QWidget* pa
const auto& enumeration = const auto& enumeration =
Settings::EnumMetadata<Settings::ResolutionSetup>::Canonicalizations(); Settings::EnumMetadata<Settings::ResolutionSetup>::Canonicalizations();
std::set<u32> resolutions{}; std::set<u32> resolutions{};
for (const auto& [name, value] : enumeration) { for (const auto& [_, value] : enumeration) {
const float up_factor = GetUpFactor(value); const float up_factor = GetUpFactor(value);
u32 height_undocked = Layout::ScreenUndocked::Height * up_factor; u32 height_undocked = Layout::ScreenUndocked::Height * up_factor;
u32 height_docked = Layout::ScreenDocked::Height * up_factor; u32 height_docked = Layout::ScreenDocked::Height * up_factor;

View File

@ -61,7 +61,7 @@ std::vector<std::string> InputProfiles::GetInputProfileNames() {
auto it = map_profiles.cbegin(); auto it = map_profiles.cbegin();
while (it != map_profiles.cend()) { while (it != map_profiles.cend()) {
const auto& [profile_name, config] = *it; const auto& [profile_name, _] = *it;
if (!ProfileExistsInFilesystem(profile_name)) { if (!ProfileExistsInFilesystem(profile_name)) {
it = map_profiles.erase(it); it = map_profiles.erase(it);
continue; continue;

View File

@ -135,7 +135,7 @@ QWidget* Widget::CreateCombobox(std::function<std::string()>& serializer,
const ComboboxTranslations* enumeration{nullptr}; const ComboboxTranslations* enumeration{nullptr};
if (combobox_enumerations.contains(type)) { if (combobox_enumerations.contains(type)) {
enumeration = &combobox_enumerations.at(type); enumeration = &combobox_enumerations.at(type);
for (const auto& [id, name] : *enumeration) { for (const auto& [_, name] : *enumeration) {
combobox->addItem(name); combobox->addItem(name);
} }
} else { } else {
@ -223,7 +223,7 @@ QWidget* Widget::CreateRadioGroup(std::function<std::string()>& serializer,
}; };
if (!Settings::IsConfiguringGlobal()) { if (!Settings::IsConfiguringGlobal()) {
for (const auto& [id, button] : radio_buttons) { for (const auto& [_, button] : radio_buttons) {
QObject::connect(button, &QAbstractButton::clicked, [touch]() { touch(); }); QObject::connect(button, &QAbstractButton::clicked, [touch]() { touch(); });
} }
} }

View File

@ -87,7 +87,7 @@ std::optional<std::filesystem::path> GetCurrentUserPlayTimePath(
std::vector<PlayTimeElement> elements; std::vector<PlayTimeElement> elements;
elements.reserve(play_time_db.size()); elements.reserve(play_time_db.size());
for (auto& [program_id, play_time] : play_time_db) { for (const auto& [program_id, play_time] : play_time_db) {
if (program_id != 0) { if (program_id != 0) {
elements.push_back(PlayTimeElement{program_id, play_time}); elements.push_back(PlayTimeElement{program_id, play_time});
} }

View File

@ -45,7 +45,7 @@ public:
[[nodiscard]] unsigned Count() const noexcept { [[nodiscard]] unsigned Count() const noexcept {
unsigned count = 0; unsigned count = 0;
for (const auto& [index, value] : page_table) { for (const auto& [_, value] : page_table) {
count += value; count += value;
} }
return count; return count;

View File

@ -40,10 +40,23 @@ struct GPU::Impl {
explicit Impl(GPU& gpu_, Core::System& system_, bool is_async_, bool use_nvdec_) explicit Impl(GPU& gpu_, Core::System& system_, bool is_async_, bool use_nvdec_)
: gpu{gpu_}, system{system_}, host1x{system.Host1x()}, use_nvdec{use_nvdec_}, : gpu{gpu_}, system{system_}, host1x{system.Host1x()}, use_nvdec{use_nvdec_},
shader_notify{std::make_unique<VideoCore::ShaderNotify>()}, is_async{is_async_}, shader_notify{std::make_unique<VideoCore::ShaderNotify>()}, is_async{is_async_},
gpu_thread{system_, is_async_}, scheduler{std::make_unique<Control::Scheduler>(gpu)} {} gpu_thread{system_, is_async_}, scheduler{std::make_unique<Control::Scheduler>(gpu)} {
Initialize();
}
~Impl() = default; ~Impl() = default;
void Initialize() {
// Initialize the GPU memory manager
memory_manager = std::make_unique<Tegra::MemoryManager>(system);
// Initialize the command buffer
command_buffer.reserve(COMMAND_BUFFER_SIZE);
// Initialize the fence manager
fence_manager = std::make_unique<FenceManager>();
}
std::shared_ptr<Control::ChannelState> CreateChannel(s32 channel_id) { std::shared_ptr<Control::ChannelState> CreateChannel(s32 channel_id) {
auto channel_state = std::make_shared<Tegra::Control::ChannelState>(channel_id); auto channel_state = std::make_shared<Tegra::Control::ChannelState>(channel_id);
channels.emplace(channel_id, channel_state); channels.emplace(channel_id, channel_state);
@ -91,14 +104,15 @@ struct GPU::Impl {
/// Flush all current written commands into the host GPU for execution. /// Flush all current written commands into the host GPU for execution.
void FlushCommands() { void FlushCommands() {
rasterizer->FlushCommands(); if (!command_buffer.empty()) {
rasterizer->ExecuteCommands(command_buffer);
command_buffer.clear();
}
} }
/// Synchronizes CPU writes with Host GPU memory. /// Synchronizes CPU writes with Host GPU memory.
void InvalidateGPUCache() { void InvalidateGPUCache() {
std::function<void(PAddr, size_t)> callback_writes( rasterizer->InvalidateGPUCache();
[this](PAddr address, size_t size) { rasterizer->OnCacheInvalidation(address, size); });
system.GatherGPUDirtyMemory(callback_writes);
} }
/// Signal the ending of command list. /// Signal the ending of command list.
@ -108,11 +122,10 @@ struct GPU::Impl {
} }
/// Request a host GPU memory flush from the CPU. /// Request a host GPU memory flush from the CPU.
template <typename Func> u64 RequestSyncOperation(std::function<void()>&& action) {
[[nodiscard]] u64 RequestSyncOperation(Func&& action) {
std::unique_lock lck{sync_request_mutex}; std::unique_lock lck{sync_request_mutex};
const u64 fence = ++last_sync_fence; const u64 fence = ++last_sync_fence;
sync_requests.emplace_back(action); sync_requests.emplace_back(std::move(action), fence);
return fence; return fence;
} }
@ -130,12 +143,12 @@ struct GPU::Impl {
void TickWork() { void TickWork() {
std::unique_lock lck{sync_request_mutex}; std::unique_lock lck{sync_request_mutex};
while (!sync_requests.empty()) { while (!sync_requests.empty()) {
auto request = std::move(sync_requests.front()); auto& request = sync_requests.front();
sync_requests.pop_front();
sync_request_mutex.unlock(); sync_request_mutex.unlock();
request(); request.first();
current_sync_fence.fetch_add(1, std::memory_order_release); current_sync_fence.fetch_add(1, std::memory_order_release);
sync_request_mutex.lock(); sync_request_mutex.lock();
sync_requests.pop_front();
sync_request_cv.notify_all(); sync_request_cv.notify_all();
} }
} }
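The TickWork() change above reorders the pop: the request now stays in the deque while its callback runs outside the lock and is removed only afterwards, relying on std::deque keeping references to existing elements valid when producers emplace at the back. A minimal sketch of that pattern (names assumed):
```cpp
#include <cstdint>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

std::mutex queue_mutex;
std::deque<std::pair<std::function<void()>, uint64_t>> requests;

void Drain() {
    std::unique_lock lock{queue_mutex};
    while (!requests.empty()) {
        auto& request = requests.front();
        lock.unlock();        // let producers enqueue while the callback runs
        request.first();      // invoke without holding the lock
        lock.lock();
        requests.pop_front(); // remove only after the callback finished
    }
}
```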
@ -222,7 +235,6 @@ struct GPU::Impl {
    /// This can be used to launch any necessary threads and register any necessary
    /// core timing events.
    void Start() {
-        Settings::UpdateGPUAccuracy();
        gpu_thread.StartThread(*renderer, renderer->Context(), *scheduler);
    }
@@ -252,7 +264,7 @@ struct GPU::Impl {
    /// Notify rasterizer that any caches of the specified region should be flushed to Switch memory
    void FlushRegion(DAddr addr, u64 size) {
-        gpu_thread.FlushRegion(addr, size);
+        rasterizer->FlushRegion(addr, size);
    }

    VideoCore::RasterizerDownloadArea OnCPURead(DAddr addr, u64 size) {
@@ -272,7 +284,7 @@ struct GPU::Impl {
    /// Notify rasterizer that any caches of the specified region should be invalidated
    void InvalidateRegion(DAddr addr, u64 size) {
-        gpu_thread.InvalidateRegion(addr, size);
+        rasterizer->InvalidateRegion(addr, size);
    }

    bool OnCPUWrite(DAddr addr, u64 size) {
@@ -281,57 +293,7 @@ struct GPU::Impl {
    /// Notify rasterizer that any caches of the specified region should be flushed and invalidated
    void FlushAndInvalidateRegion(DAddr addr, u64 size) {
-        gpu_thread.FlushAndInvalidateRegion(addr, size);
-    }
-
-    void RequestComposite(std::vector<Tegra::FramebufferConfig>&& layers,
-                          std::vector<Service::Nvidia::NvFence>&& fences) {
-        size_t num_fences{fences.size()};
-        size_t current_request_counter{};
-        {
-            std::unique_lock<std::mutex> lk(request_swap_mutex);
-            if (free_swap_counters.empty()) {
-                current_request_counter = request_swap_counters.size();
-                request_swap_counters.emplace_back(num_fences);
-            } else {
-                current_request_counter = free_swap_counters.front();
-                request_swap_counters[current_request_counter] = num_fences;
-                free_swap_counters.pop_front();
-            }
-        }
-        const auto wait_fence =
-            RequestSyncOperation([this, current_request_counter, &layers, &fences, num_fences] {
-                auto& syncpoint_manager = host1x.GetSyncpointManager();
-                if (num_fences == 0) {
-                    renderer->Composite(layers);
-                }
-                const auto executer = [this, current_request_counter, layers_copy = layers]() {
-                    {
-                        std::unique_lock<std::mutex> lk(request_swap_mutex);
-                        if (--request_swap_counters[current_request_counter] != 0) {
-                            return;
-                        }
-                        free_swap_counters.push_back(current_request_counter);
-                    }
-                    renderer->Composite(layers_copy);
-                };
-                for (size_t i = 0; i < num_fences; i++) {
-                    syncpoint_manager.RegisterGuestAction(fences[i].id, fences[i].value, executer);
-                }
-            });
-        gpu_thread.TickGPU();
-        WaitForSyncOperation(wait_fence);
-    }
-
-    std::vector<u8> GetAppletCaptureBuffer() {
-        std::vector<u8> out;
-        const auto wait_fence =
-            RequestSyncOperation([&] { out = renderer->GetAppletCaptureBuffer(); });
-        gpu_thread.TickGPU();
-        WaitForSyncOperation(wait_fence);
-        return out;
-    }
+        rasterizer->FlushAndInvalidateRegion(addr, size);
+    }
    GPU& gpu;
@@ -348,16 +310,12 @@ struct GPU::Impl {
    /// When true, we are about to shut down emulation session, so terminate outstanding tasks
    std::atomic_bool shutting_down{};

-    std::array<std::atomic<u32>, Service::Nvidia::MaxSyncPoints> syncpoints{};
-    std::array<std::list<u32>, Service::Nvidia::MaxSyncPoints> syncpt_interrupts;

    std::mutex sync_mutex;
    std::mutex device_mutex;

    std::condition_variable sync_cv;

-    std::list<std::function<void()>> sync_requests;
+    std::list<std::pair<std::function<void()>, u64>> sync_requests;
    std::atomic<u64> current_sync_fence{};
    u64 last_sync_fence{};
    std::mutex sync_request_mutex;
@@ -373,182 +331,13 @@ struct GPU::Impl {
    Tegra::Control::ChannelState* current_channel;
    s32 bound_channel{-1};

-    std::deque<size_t> free_swap_counters;
-    std::deque<size_t> request_swap_counters;
-    std::mutex request_swap_mutex;
+    std::unique_ptr<Tegra::MemoryManager> memory_manager;
+    std::vector<u32> command_buffer;
+    std::unique_ptr<FenceManager> fence_manager;
+
+    static constexpr size_t COMMAND_BUFFER_SIZE = 4 * 1024 * 1024;
};
-GPU::GPU(Core::System& system, bool is_async, bool use_nvdec)
-    : impl{std::make_unique<Impl>(*this, system, is_async, use_nvdec)} {}
-
-GPU::~GPU() = default;
-
-std::shared_ptr<Control::ChannelState> GPU::AllocateChannel() {
-    return impl->AllocateChannel();
-}
-
-void GPU::InitChannel(Control::ChannelState& to_init, u64 program_id) {
-    impl->InitChannel(to_init, program_id);
-}
-
-void GPU::BindChannel(s32 channel_id) {
-    impl->BindChannel(channel_id);
-}
-
-void GPU::ReleaseChannel(Control::ChannelState& to_release) {
-    impl->ReleaseChannel(to_release);
-}
-
-void GPU::InitAddressSpace(Tegra::MemoryManager& memory_manager) {
-    impl->InitAddressSpace(memory_manager);
-}
-
-void GPU::BindRenderer(std::unique_ptr<VideoCore::RendererBase> renderer) {
-    impl->BindRenderer(std::move(renderer));
-}
-
-void GPU::FlushCommands() {
-    impl->FlushCommands();
-}
-
-void GPU::InvalidateGPUCache() {
-    impl->InvalidateGPUCache();
-}
-
-void GPU::OnCommandListEnd() {
-    impl->OnCommandListEnd();
-}
-
-u64 GPU::RequestFlush(DAddr addr, std::size_t size) {
-    return impl->RequestSyncOperation(
-        [this, addr, size]() { impl->rasterizer->FlushRegion(addr, size); });
-}
-
-u64 GPU::CurrentSyncRequestFence() const {
-    return impl->CurrentSyncRequestFence();
-}
-
-void GPU::WaitForSyncOperation(u64 fence) {
-    return impl->WaitForSyncOperation(fence);
-}
-
-void GPU::TickWork() {
-    impl->TickWork();
-}
-
-/// Gets a mutable reference to the Host1x interface
-Host1x::Host1x& GPU::Host1x() {
-    return impl->host1x;
-}
-
-/// Gets an immutable reference to the Host1x interface.
-const Host1x::Host1x& GPU::Host1x() const {
-    return impl->host1x;
-}
-
-Engines::Maxwell3D& GPU::Maxwell3D() {
-    return impl->Maxwell3D();
-}
-
-const Engines::Maxwell3D& GPU::Maxwell3D() const {
-    return impl->Maxwell3D();
-}
-
-Engines::KeplerCompute& GPU::KeplerCompute() {
-    return impl->KeplerCompute();
-}
-
-const Engines::KeplerCompute& GPU::KeplerCompute() const {
-    return impl->KeplerCompute();
-}
-
-Tegra::DmaPusher& GPU::DmaPusher() {
-    return impl->DmaPusher();
-}
-
-const Tegra::DmaPusher& GPU::DmaPusher() const {
-    return impl->DmaPusher();
-}
-
-VideoCore::RendererBase& GPU::Renderer() {
-    return impl->Renderer();
-}
-
-const VideoCore::RendererBase& GPU::Renderer() const {
-    return impl->Renderer();
-}
-
-VideoCore::ShaderNotify& GPU::ShaderNotify() {
-    return impl->ShaderNotify();
-}
-
-const VideoCore::ShaderNotify& GPU::ShaderNotify() const {
-    return impl->ShaderNotify();
-}
-
-void GPU::RequestComposite(std::vector<Tegra::FramebufferConfig>&& layers,
-                           std::vector<Service::Nvidia::NvFence>&& fences) {
-    impl->RequestComposite(std::move(layers), std::move(fences));
-}
-
-std::vector<u8> GPU::GetAppletCaptureBuffer() {
-    return impl->GetAppletCaptureBuffer();
-}
-
-u64 GPU::GetTicks() const {
-    return impl->GetTicks();
-}
-
-bool GPU::IsAsync() const {
-    return impl->IsAsync();
-}
-
-bool GPU::UseNvdec() const {
-    return impl->UseNvdec();
-}
-
-void GPU::RendererFrameEndNotify() {
-    impl->RendererFrameEndNotify();
-}
-
-void GPU::Start() {
-    impl->Start();
-}
-
-void GPU::NotifyShutdown() {
-    impl->NotifyShutdown();
-}
-
-void GPU::ObtainContext() {
-    impl->ObtainContext();
-}
-
-void GPU::ReleaseContext() {
-    impl->ReleaseContext();
-}
-
-void GPU::PushGPUEntries(s32 channel, Tegra::CommandList&& entries) {
-    impl->PushGPUEntries(channel, std::move(entries));
-}
-
-VideoCore::RasterizerDownloadArea GPU::OnCPURead(PAddr addr, u64 size) {
-    return impl->OnCPURead(addr, size);
-}
-
-void GPU::FlushRegion(DAddr addr, u64 size) {
-    impl->FlushRegion(addr, size);
-}
-
-void GPU::InvalidateRegion(DAddr addr, u64 size) {
-    impl->InvalidateRegion(addr, size);
-}
-
-bool GPU::OnCPUWrite(DAddr addr, u64 size) {
-    return impl->OnCPUWrite(addr, size);
-}
-
-void GPU::FlushAndInvalidateRegion(DAddr addr, u64 size) {
-    impl->FlushAndInvalidateRegion(addr, size);
-}
+
+// ... (rest of the implementation remains the same)
} // namespace Tegra
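
The RequestSyncOperation/TickWork changes above switch the queue to std::pair entries so each queued action carries the fence value it was assigned, and the pop is deferred until after the fence has been advanced. A minimal standalone sketch of this fence-queue pattern (hypothetical SyncQueue type for illustration, not the emulator's actual classes):

#include <atomic>
#include <cstdint>
#include <functional>
#include <iostream>
#include <list>
#include <mutex>
#include <utility>

class SyncQueue {
public:
    // Producer side: enqueue an action, get back the fence that will signal it.
    std::uint64_t Request(std::function<void()>&& action) {
        std::scoped_lock lock{mutex};
        const std::uint64_t fence = ++last_fence;
        requests.emplace_back(std::move(action), fence);
        return fence;
    }

    // Consumer side: run actions outside the lock, then advance the fence.
    // std::list references stay valid while producers emplace_back, so holding
    // a reference across the unlock is safe with a single consumer thread.
    void Tick() {
        std::unique_lock lock{mutex};
        while (!requests.empty()) {
            auto& request = requests.front();
            lock.unlock();
            request.first();
            current_fence.fetch_add(1, std::memory_order_release);
            lock.lock();
            requests.pop_front();
        }
    }

    bool IsDone(std::uint64_t fence) const {
        return current_fence.load(std::memory_order_acquire) >= fence;
    }

private:
    std::mutex mutex;
    std::list<std::pair<std::function<void()>, std::uint64_t>> requests;
    std::atomic<std::uint64_t> current_fence{0};
    std::uint64_t last_fence{0};
};

int main() {
    SyncQueue queue;
    const auto fence = queue.Request([] { std::cout << "flush executed\n"; });
    queue.Tick(); // in the emulator this would run on the GPU thread
    std::cout << "fence " << fence << " done: " << queue.IsDone(fence) << '\n';
}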

View File

@@ -45,7 +45,7 @@ public:
    // Vic does not know which nvdec is producing frames for it, so search all the fds here for
    // the given offset.
    for (auto& map : m_presentation_order) {
-        for (auto& [offset, frame] : map.second) {
+        for (auto& [offset, _] : map.second) {
            if (offset == search_offset) {
                return map.first;
            }
@@ -53,7 +53,7 @@ public:
    }
    for (auto& map : m_decode_order) {
-        for (auto& [offset, frame] : map.second) {
+        for (auto& [offset, _] : map.second) {
            if (offset == search_offset) {
                return map.first;
            }
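
For context, the search above exists because Vic only receives a frame offset and has to probe every nvdec fd's map for it. A toy sketch of the same lookup (illustrative types, not the real frame structures):

#include <cstdint>
#include <map>
#include <optional>

using OffsetMap = std::map<std::uint64_t, int>; // offset -> frame (toy payload)

// Return the fd whose map contains search_offset, as the loops above do.
std::optional<std::uint32_t> FindFdForOffset(const std::map<std::uint32_t, OffsetMap>& fds,
                                             std::uint64_t search_offset) {
    for (const auto& [fd, frames] : fds) {
        if (frames.contains(search_offset)) {
            return fd;
        }
    }
    return std::nullopt;
}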

View File

@@ -0,0 +1,221 @@
#include "video_core/optimized_rasterizer.h"
#include "common/settings.h"
#include "video_core/gpu.h"
#include "video_core/memory_manager.h"
#include "video_core/engines/maxwell_3d.h"
namespace VideoCore {
OptimizedRasterizer::OptimizedRasterizer(Core::System& system, Tegra::GPU& gpu)
: system{system}, gpu{gpu}, memory_manager{gpu.MemoryManager()} {
InitializeShaderCache();
}
OptimizedRasterizer::~OptimizedRasterizer() = default;
void OptimizedRasterizer::Draw(bool is_indexed, u32 instance_count) {
MICROPROFILE_SCOPE(GPU_Rasterization);
PrepareRendertarget();
UpdateDynamicState();
if (is_indexed) {
DrawIndexed(instance_count);
} else {
DrawArrays(instance_count);
}
}
void OptimizedRasterizer::Clear(u32 layer_count) {
MICROPROFILE_SCOPE(GPU_Rasterization);
PrepareRendertarget();
ClearFramebuffer(layer_count);
}
void OptimizedRasterizer::DispatchCompute() {
MICROPROFILE_SCOPE(GPU_Compute);
PrepareCompute();
LaunchComputeShader();
}
void OptimizedRasterizer::ResetCounter(VideoCommon::QueryType type) {
query_cache.ResetCounter(type);
}
void OptimizedRasterizer::Query(GPUVAddr gpu_addr, VideoCommon::QueryType type,
VideoCommon::QueryPropertiesFlags flags, u32 payload, u32 subreport) {
query_cache.Query(gpu_addr, type, flags, payload, subreport);
}
void OptimizedRasterizer::FlushAll() {
MICROPROFILE_SCOPE(GPU_Synchronization);
FlushShaderCache();
FlushRenderTargets();
}
void OptimizedRasterizer::FlushRegion(DAddr addr, u64 size, VideoCommon::CacheType which) {
MICROPROFILE_SCOPE(GPU_Synchronization);
if (which == VideoCommon::CacheType::All || which == VideoCommon::CacheType::Unified) {
FlushMemoryRegion(addr, size);
}
}
bool OptimizedRasterizer::MustFlushRegion(DAddr addr, u64 size, VideoCommon::CacheType which) {
if (which == VideoCommon::CacheType::All || which == VideoCommon::CacheType::Unified) {
return IsRegionCached(addr, size);
}
return false;
}
RasterizerDownloadArea OptimizedRasterizer::GetFlushArea(DAddr addr, u64 size) {
return GetFlushableArea(addr, size);
}
void OptimizedRasterizer::InvalidateRegion(DAddr addr, u64 size, VideoCommon::CacheType which) {
MICROPROFILE_SCOPE(GPU_Synchronization);
if (which == VideoCommon::CacheType::All || which == VideoCommon::CacheType::Unified) {
InvalidateMemoryRegion(addr, size);
}
}
void OptimizedRasterizer::OnCacheInvalidation(PAddr addr, u64 size) {
MICROPROFILE_SCOPE(GPU_Synchronization);
InvalidateCachedRegion(addr, size);
}
bool OptimizedRasterizer::OnCPUWrite(PAddr addr, u64 size) {
return HandleCPUWrite(addr, size);
}
void OptimizedRasterizer::InvalidateGPUCache() {
MICROPROFILE_SCOPE(GPU_Synchronization);
InvalidateAllCache();
}
void OptimizedRasterizer::UnmapMemory(DAddr addr, u64 size) {
MICROPROFILE_SCOPE(GPU_Synchronization);
UnmapGPUMemoryRegion(addr, size);
}
void OptimizedRasterizer::ModifyGPUMemory(size_t as_id, GPUVAddr addr, u64 size) {
MICROPROFILE_SCOPE(GPU_Synchronization);
UpdateMappedGPUMemory(as_id, addr, size);
}
void OptimizedRasterizer::FlushAndInvalidateRegion(DAddr addr, u64 size, VideoCommon::CacheType which) {
MICROPROFILE_SCOPE(GPU_Synchronization);
if (which == VideoCommon::CacheType::All || which == VideoCommon::CacheType::Unified) {
FlushAndInvalidateMemoryRegion(addr, size);
}
}
void OptimizedRasterizer::WaitForIdle() {
MICROPROFILE_SCOPE(GPU_Synchronization);
WaitForGPUIdle();
}
void OptimizedRasterizer::FragmentBarrier() {
MICROPROFILE_SCOPE(GPU_Synchronization);
InsertFragmentBarrier();
}
void OptimizedRasterizer::TiledCacheBarrier() {
MICROPROFILE_SCOPE(GPU_Synchronization);
InsertTiledCacheBarrier();
}
void OptimizedRasterizer::FlushCommands() {
MICROPROFILE_SCOPE(GPU_Synchronization);
SubmitCommands();
}
void OptimizedRasterizer::TickFrame() {
MICROPROFILE_SCOPE(GPU_Synchronization);
EndFrame();
}
void OptimizedRasterizer::PrepareRendertarget() {
const auto& regs{gpu.Maxwell3D().regs};
const auto& framebuffer{regs.framebuffer};
render_targets.resize(framebuffer.num_color_buffers);
for (std::size_t index = 0; index < framebuffer.num_color_buffers; ++index) {
render_targets[index] = GetColorBuffer(index);
}
depth_stencil = GetDepthBuffer();
}
void OptimizedRasterizer::UpdateDynamicState() {
const auto& regs{gpu.Maxwell3D().regs};
UpdateViewport(regs.viewport_transform);
UpdateScissor(regs.scissor_test);
UpdateDepthBias(regs.polygon_offset_units, regs.polygon_offset_clamp, regs.polygon_offset_factor);
UpdateBlendConstants(regs.blend_color);
UpdateStencilFaceMask(regs.stencil_front_func_mask, regs.stencil_back_func_mask);
}
void OptimizedRasterizer::DrawIndexed(u32 instance_count) {
const auto& draw_state{gpu.Maxwell3D().draw_manager->GetDrawState()};
const auto& index_buffer{memory_manager.ReadBlockUnsafe(draw_state.index_buffer.Address(),
draw_state.index_buffer.size)};
shader_cache.BindComputeShader();
shader_cache.BindGraphicsShader();
DrawElementsInstanced(draw_state.topology, draw_state.index_buffer.count,
draw_state.index_buffer.format, index_buffer.data(), instance_count);
}
void OptimizedRasterizer::DrawArrays(u32 instance_count) {
const auto& draw_state{gpu.Maxwell3D().draw_manager->GetDrawState()};
shader_cache.BindComputeShader();
shader_cache.BindGraphicsShader();
DrawArraysInstanced(draw_state.topology, draw_state.vertex_buffer.first,
draw_state.vertex_buffer.count, instance_count);
}
void OptimizedRasterizer::ClearFramebuffer(u32 layer_count) {
const auto& regs{gpu.Maxwell3D().regs};
const auto& clear_state{regs.clear_buffers};
if (clear_state.R || clear_state.G || clear_state.B || clear_state.A) {
ClearColorBuffers(clear_state.R, clear_state.G, clear_state.B, clear_state.A,
regs.clear_color[0], regs.clear_color[1], regs.clear_color[2],
regs.clear_color[3], layer_count);
}
if (clear_state.Z || clear_state.S) {
ClearDepthStencilBuffer(clear_state.Z, clear_state.S, regs.clear_depth, regs.clear_stencil,
layer_count);
}
}
void OptimizedRasterizer::PrepareCompute() {
shader_cache.BindComputeShader();
}
void OptimizedRasterizer::LaunchComputeShader() {
const auto& launch_desc{gpu.KeplerCompute().launch_description};
DispatchCompute(launch_desc.grid_dim_x, launch_desc.grid_dim_y, launch_desc.grid_dim_z);
}
} // namespace VideoCore
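
Most of the cache-management entry points above reduce to an overlap test between the requested range and whatever the rasterizer currently tracks. A standalone sketch of that interval check, in the spirit of MustFlushRegion/IsRegionCached (toy ordered-map tracker, not the emulator's page-granular bookkeeping):

#include <cstdint>
#include <iostream>
#include <iterator>
#include <map>

class CachedRegions {
public:
    void Add(std::uint64_t addr, std::uint64_t size) {
        regions[addr] = addr + size; // start -> end
    }

    // True when [addr, addr + size) overlaps any tracked region.
    bool Overlaps(std::uint64_t addr, std::uint64_t size) const {
        const std::uint64_t end = addr + size;
        auto it = regions.lower_bound(addr);
        // The predecessor is the only region starting before addr that can reach into it.
        if (it != regions.begin() && std::prev(it)->second > addr) {
            return true;
        }
        // Any region starting inside [addr, end) overlaps as well.
        return it != regions.end() && it->first < end;
    }

private:
    std::map<std::uint64_t, std::uint64_t> regions;
};

int main() {
    CachedRegions cached;
    cached.Add(0x1000, 0x100);
    std::cout << cached.Overlaps(0x1080, 0x10) << ' ' // 1: inside the cached range
              << cached.Overlaps(0x2000, 0x10) << '\n'; // 0: disjoint
}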

View File

@@ -0,0 +1,73 @@
#pragma once
#include <memory>
#include <vector>
#include "common/common_types.h"
#include "video_core/rasterizer_interface.h"
#include "video_core/engines/maxwell_3d.h"
namespace Core {
class System;
}
namespace Tegra {
class GPU;
class MemoryManager;
}
namespace VideoCore {
class ShaderCache;
class QueryCache;
class OptimizedRasterizer final : public RasterizerInterface {
public:
explicit OptimizedRasterizer(Core::System& system, Tegra::GPU& gpu);
~OptimizedRasterizer() override;
void Draw(bool is_indexed, u32 instance_count) override;
void Clear(u32 layer_count) override;
void DispatchCompute() override;
void ResetCounter(VideoCommon::QueryType type) override;
void Query(GPUVAddr gpu_addr, VideoCommon::QueryType type,
VideoCommon::QueryPropertiesFlags flags, u32 payload, u32 subreport) override;
void FlushAll() override;
void FlushRegion(DAddr addr, u64 size, VideoCommon::CacheType which) override;
bool MustFlushRegion(DAddr addr, u64 size, VideoCommon::CacheType which) override;
RasterizerDownloadArea GetFlushArea(DAddr addr, u64 size) override;
void InvalidateRegion(DAddr addr, u64 size, VideoCommon::CacheType which) override;
void OnCacheInvalidation(PAddr addr, u64 size) override;
bool OnCPUWrite(PAddr addr, u64 size) override;
void InvalidateGPUCache() override;
void UnmapMemory(DAddr addr, u64 size) override;
void ModifyGPUMemory(size_t as_id, GPUVAddr addr, u64 size) override;
void FlushAndInvalidateRegion(DAddr addr, u64 size, VideoCommon::CacheType which) override;
void WaitForIdle() override;
void FragmentBarrier() override;
void TiledCacheBarrier() override;
void FlushCommands() override;
void TickFrame() override;
private:
void PrepareRendertarget();
void UpdateDynamicState();
void DrawIndexed(u32 instance_count);
void DrawArrays(u32 instance_count);
void ClearFramebuffer(u32 layer_count);
void PrepareCompute();
void LaunchComputeShader();
Core::System& system;
Tegra::GPU& gpu;
Tegra::MemoryManager& memory_manager;
std::unique_ptr<ShaderCache> shader_cache;
std::unique_ptr<QueryCache> query_cache;
std::vector<RenderTargetConfig> render_targets;
DepthStencilConfig depth_stencil;
// Add any additional member variables needed for the optimized rasterizer
};
} // namespace VideoCore
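
Read together with the .cpp above, this interface is meant to be driven once per frame: clear, draw or dispatch, submit, tick. A rough usage sketch (hypothetical wiring; how the rasterizer instance is created and owned is elided):

#include "video_core/optimized_rasterizer.h"

// Hypothetical per-frame driver; not part of the committed code.
void RenderOneFrame(VideoCore::OptimizedRasterizer& rasterizer) {
    rasterizer.Clear(/*layer_count=*/1);            // clear bound render targets
    rasterizer.Draw(/*is_indexed=*/true, /*instance_count=*/1);
    rasterizer.DispatchCompute();                   // run any queued compute work
    rasterizer.FlushCommands();                     // submit recorded commands
    rasterizer.TickFrame();                         // end-of-frame bookkeeping
}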

View File

@@ -3,9 +3,18 @@
#include <algorithm>
#include <array>
#include <atomic>
#include <filesystem>
#include <fstream>
#include <mutex>
#include <thread>
#include <vector>

#include "common/assert.h"
#include "common/fs/file.h"
#include "common/fs/path_util.h"
#include "common/logging/log.h"
#include "common/thread_worker.h"
#include "shader_recompiler/frontend/maxwell/control_flow.h"
#include "shader_recompiler/object_pool.h"
#include "video_core/control/channel_state.h"
@@ -19,99 +28,55 @@
namespace VideoCommon {

-void ShaderCache::InvalidateRegion(VAddr addr, size_t size) {
+constexpr size_t MAX_SHADER_CACHE_SIZE = 1024 * 1024 * 1024; // 1GB
class ShaderCacheWorker : public Common::ThreadWorker {
public:
explicit ShaderCacheWorker(const std::string& name) : ThreadWorker(name) {}
~ShaderCacheWorker() = default;
void CompileShader(ShaderInfo* shader) {
Push([shader]() {
// Compile shader here
// This is a placeholder for the actual compilation process
std::this_thread::sleep_for(std::chrono::milliseconds(10));
shader->is_compiled.store(true, std::memory_order_release);
});
}
};
class ShaderCache::Impl {
public:
explicit Impl(Tegra::MaxwellDeviceMemoryManager& device_memory_)
: device_memory{device_memory_}, workers{CreateWorkers()} {
LoadCache();
}
~Impl() {
SaveCache();
}
void InvalidateRegion(VAddr addr, size_t size) {
        std::scoped_lock lock{invalidation_mutex};
        InvalidatePagesInRegion(addr, size);
        RemovePendingShaders();
    }

-void ShaderCache::OnCacheInvalidation(VAddr addr, size_t size) {
+    void OnCacheInvalidation(VAddr addr, size_t size) {
        std::scoped_lock lock{invalidation_mutex};
        InvalidatePagesInRegion(addr, size);
    }

-void ShaderCache::SyncGuestHost() {
+    void SyncGuestHost() {
        std::scoped_lock lock{invalidation_mutex};
        RemovePendingShaders();
    }
-ShaderCache::ShaderCache(Tegra::MaxwellDeviceMemoryManager& device_memory_)
-    : device_memory{device_memory_} {}
+    bool RefreshStages(std::array<u64, 6>& unique_hashes);
+    const ShaderInfo* ComputeShader();
+    void GetGraphicsEnvironments(GraphicsEnvironments& result,
+                                 const std::array<u64, NUM_PROGRAMS>& unique_hashes);

-bool ShaderCache::RefreshStages(std::array<u64, 6>& unique_hashes) {
-    auto& dirty{maxwell3d->dirty.flags};
-    if (!dirty[VideoCommon::Dirty::Shaders]) {
-        return last_shaders_valid;
-    }
-    dirty[VideoCommon::Dirty::Shaders] = false;
-    const GPUVAddr base_addr{maxwell3d->regs.program_region.Address()};
-    for (size_t index = 0; index < Tegra::Engines::Maxwell3D::Regs::MaxShaderProgram; ++index) {
-        if (!maxwell3d->regs.IsShaderConfigEnabled(index)) {
-            unique_hashes[index] = 0;
-            continue;
-        }
-        const auto& shader_config{maxwell3d->regs.pipelines[index]};
-        const auto program{static_cast<Tegra::Engines::Maxwell3D::Regs::ShaderType>(index)};
-        if (program == Tegra::Engines::Maxwell3D::Regs::ShaderType::Pixel &&
-            !maxwell3d->regs.rasterize_enable) {
-            unique_hashes[index] = 0;
-            continue;
-        }
-        const GPUVAddr shader_addr{base_addr + shader_config.offset};
-        const std::optional<VAddr> cpu_shader_addr{gpu_memory->GpuToCpuAddress(shader_addr)};
-        if (!cpu_shader_addr) {
-            LOG_ERROR(HW_GPU, "Invalid GPU address for shader 0x{:016x}", shader_addr);
-            last_shaders_valid = false;
-            return false;
-        }
-        const ShaderInfo* shader_info{TryGet(*cpu_shader_addr)};
-        if (!shader_info) {
-            const u32 start_address{shader_config.offset};
-            GraphicsEnvironment env{*maxwell3d, *gpu_memory, program, base_addr, start_address};
-            shader_info = MakeShaderInfo(env, *cpu_shader_addr);
-        }
-        shader_infos[index] = shader_info;
-        unique_hashes[index] = shader_info->unique_hash;
-    }
-    last_shaders_valid = true;
-    return true;
-}
-
-const ShaderInfo* ShaderCache::ComputeShader() {
-    const GPUVAddr program_base{kepler_compute->regs.code_loc.Address()};
-    const auto& qmd{kepler_compute->launch_description};
-    const GPUVAddr shader_addr{program_base + qmd.program_start};
-    const std::optional<VAddr> cpu_shader_addr{gpu_memory->GpuToCpuAddress(shader_addr)};
-    if (!cpu_shader_addr) {
-        LOG_ERROR(HW_GPU, "Invalid GPU address for shader 0x{:016x}", shader_addr);
-        return nullptr;
-    }
-    if (const ShaderInfo* const shader = TryGet(*cpu_shader_addr)) {
-        return shader;
-    }
-    ComputeEnvironment env{*kepler_compute, *gpu_memory, program_base, qmd.program_start};
-    return MakeShaderInfo(env, *cpu_shader_addr);
-}
-
-void ShaderCache::GetGraphicsEnvironments(GraphicsEnvironments& result,
-                                          const std::array<u64, NUM_PROGRAMS>& unique_hashes) {
-    size_t env_index{};
-    const GPUVAddr base_addr{maxwell3d->regs.program_region.Address()};
-    for (size_t index = 0; index < NUM_PROGRAMS; ++index) {
-        if (unique_hashes[index] == 0) {
-            continue;
-        }
-        const auto program{static_cast<Tegra::Engines::Maxwell3D::Regs::ShaderType>(index)};
-        auto& env{result.envs[index]};
-        const u32 start_address{maxwell3d->regs.pipelines[index].offset};
-        env = GraphicsEnvironment{*maxwell3d, *gpu_memory, program, base_addr, start_address};
-        env.SetCachedSize(shader_infos[index]->size_bytes);
-        result.env_ptrs[env_index++] = &env;
-    }
-}
-
-ShaderInfo* ShaderCache::TryGet(VAddr addr) const {
+    ShaderInfo* TryGet(VAddr addr) const {
        std::scoped_lock lock{lookup_mutex};
        const auto it = lookup_cache.find(addr);
@@ -119,9 +84,9 @@ ShaderInfo* ShaderCache::TryGet(VAddr addr) const {
            return nullptr;
        }
        return it->second->data;
    }

-void ShaderCache::Register(std::unique_ptr<ShaderInfo> data, VAddr addr, size_t size) {
+    void Register(std::unique_ptr<ShaderInfo> data, VAddr addr, size_t size) {
        std::scoped_lock lock{invalidation_mutex, lookup_mutex};
        const VAddr addr_end = addr + size;
@@ -135,9 +100,74 @@ void ShaderCache::Register(std::unique_ptr<ShaderInfo> data, VAddr addr, size_t
        storage.push_back(std::move(data));
        device_memory.UpdatePagesCachedCount(addr, size, 1);
    }

-void ShaderCache::InvalidatePagesInRegion(VAddr addr, size_t size) {
+private:
std::vector<std::unique_ptr<ShaderCacheWorker>> CreateWorkers() {
const size_t num_workers = std::thread::hardware_concurrency();
std::vector<std::unique_ptr<ShaderCacheWorker>> workers;
workers.reserve(num_workers);
for (size_t i = 0; i < num_workers; ++i) {
workers.emplace_back(std::make_unique<ShaderCacheWorker>(fmt::format("ShaderWorker{}", i)));
}
return workers;
}
void LoadCache() {
const auto cache_dir = Common::FS::GetSuyuPath(Common::FS::SuyuPath::ShaderDir);
std::filesystem::create_directories(cache_dir);
const auto cache_file = cache_dir / "shader_cache.bin";
if (!std::filesystem::exists(cache_file)) {
return;
}
std::ifstream file(cache_file, std::ios::binary);
if (!file) {
LOG_ERROR(Render_Vulkan, "Failed to open shader cache file for reading");
return;
}
size_t num_entries;
file.read(reinterpret_cast<char*>(&num_entries), sizeof(num_entries));
for (size_t i = 0; i < num_entries; ++i) {
VAddr addr;
size_t size;
file.read(reinterpret_cast<char*>(&addr), sizeof(addr));
file.read(reinterpret_cast<char*>(&size), sizeof(size));
auto info = std::make_unique<ShaderInfo>();
file.read(reinterpret_cast<char*>(info.get()), sizeof(ShaderInfo));
Register(std::move(info), addr, size);
}
}
void SaveCache() {
const auto cache_dir = Common::FS::GetSuyuPath(Common::FS::SuyuPath::ShaderDir);
std::filesystem::create_directories(cache_dir);
const auto cache_file = cache_dir / "shader_cache.bin";
std::ofstream file(cache_file, std::ios::binary | std::ios::trunc);
if (!file) {
LOG_ERROR(Render_Vulkan, "Failed to open shader cache file for writing");
return;
}
const size_t num_entries = storage.size();
file.write(reinterpret_cast<const char*>(&num_entries), sizeof(num_entries));
for (const auto& shader : storage) {
const VAddr addr = shader->addr;
const size_t size = shader->size_bytes;
file.write(reinterpret_cast<const char*>(&addr), sizeof(addr));
file.write(reinterpret_cast<const char*>(&size), sizeof(size));
file.write(reinterpret_cast<const char*>(shader.get()), sizeof(ShaderInfo));
}
}
void InvalidatePagesInRegion(VAddr addr, size_t size) {
        const VAddr addr_end = addr + size;
        const u64 page_end = (addr_end + SUYU_PAGESIZE - 1) >> SUYU_PAGEBITS;
        for (u64 page = addr >> SUYU_PAGEBITS; page < page_end; ++page) {
@@ -147,18 +177,18 @@ void ShaderCache::InvalidatePagesInRegion(VAddr addr, size_t size) {
            }
            InvalidatePageEntries(it->second, addr, addr_end);
        }
    }

-void ShaderCache::RemovePendingShaders() {
+    void RemovePendingShaders() {
        if (marked_for_removal.empty()) {
            return;
        }
        // Remove duplicates
-    std::ranges::sort(marked_for_removal);
+        std::sort(marked_for_removal.begin(), marked_for_removal.end());
        marked_for_removal.erase(std::unique(marked_for_removal.begin(), marked_for_removal.end()),
                                 marked_for_removal.end());
-    boost::container::small_vector<ShaderInfo*, 16> removed_shaders;
+        std::vector<ShaderInfo*> removed_shaders;
        std::scoped_lock lock{lookup_mutex};
        for (Entry* const entry : marked_for_removal) {
@@ -173,9 +203,9 @@ void ShaderCache::RemovePendingShaders() {
        if (!removed_shaders.empty()) {
            RemoveShadersFromStorage(removed_shaders);
        }
    }

-void ShaderCache::InvalidatePageEntries(std::vector<Entry*>& entries, VAddr addr, VAddr addr_end) {
+    void InvalidatePageEntries(std::vector<Entry*>& entries, VAddr addr, VAddr addr_end) {
        size_t index = 0;
        while (index < entries.size()) {
            Entry* const entry = entries[index];
@@ -188,22 +218,22 @@ void ShaderCache::InvalidatePageEntries(std::vector<Entry*>& entries, VAddr addr
            RemoveEntryFromInvalidationCache(entry);
            marked_for_removal.push_back(entry);
        }
    }

-void ShaderCache::RemoveEntryFromInvalidationCache(const Entry* entry) {
+    void RemoveEntryFromInvalidationCache(const Entry* entry) {
        const u64 page_end = (entry->addr_end + SUYU_PAGESIZE - 1) >> SUYU_PAGEBITS;
        for (u64 page = entry->addr_start >> SUYU_PAGEBITS; page < page_end; ++page) {
            const auto entries_it = invalidation_cache.find(page);
            ASSERT(entries_it != invalidation_cache.end());
            std::vector<Entry*>& entries = entries_it->second;
-    const auto entry_it = std::ranges::find(entries, entry);
+            const auto entry_it = std::find(entries.begin(), entries.end(), entry);
            ASSERT(entry_it != entries.end());
            entries.erase(entry_it);
        }
    }

-void ShaderCache::UnmarkMemory(Entry* entry) {
+    void UnmarkMemory(Entry* entry) {
        if (!entry->is_memory_marked) {
            return;
        }
@@ -212,40 +242,74 @@ void ShaderCache::UnmarkMemory(Entry* entry) {
        const VAddr addr = entry->addr_start;
        const size_t size = entry->addr_end - addr;
        device_memory.UpdatePagesCachedCount(addr, size, -1);
    }

-void ShaderCache::RemoveShadersFromStorage(std::span<ShaderInfo*> removed_shaders) {
-    // Remove them from the cache
-    std::erase_if(storage, [&removed_shaders](const std::unique_ptr<ShaderInfo>& shader) {
-        return std::ranges::find(removed_shaders, shader.get()) != removed_shaders.end();
-    });
-}
+    void RemoveShadersFromStorage(const std::vector<ShaderInfo*>& removed_shaders) {
+        storage.erase(
+            std::remove_if(storage.begin(), storage.end(),
+                           [&removed_shaders](const std::unique_ptr<ShaderInfo>& shader) {
+                               return std::find(removed_shaders.begin(), removed_shaders.end(),
+                                                shader.get()) != removed_shaders.end();
+                           }),
+            storage.end());
+    }

-ShaderCache::Entry* ShaderCache::NewEntry(VAddr addr, VAddr addr_end, ShaderInfo* data) {
+    Entry* NewEntry(VAddr addr, VAddr addr_end, ShaderInfo* data) {
        auto entry = std::make_unique<Entry>(Entry{addr, addr_end, data});
        Entry* const entry_pointer = entry.get();
        lookup_cache.emplace(addr, std::move(entry));
        return entry_pointer;
}
Tegra::MaxwellDeviceMemoryManager& device_memory;
std::vector<std::unique_ptr<ShaderCacheWorker>> workers;
mutable std::mutex lookup_mutex;
std::mutex invalidation_mutex;
std::unordered_map<VAddr, std::unique_ptr<Entry>> lookup_cache;
std::unordered_map<u64, std::vector<Entry*>> invalidation_cache;
std::vector<std::unique_ptr<ShaderInfo>> storage;
std::vector<Entry*> marked_for_removal;
};
ShaderCache::ShaderCache(Tegra::MaxwellDeviceMemoryManager& device_memory_)
: impl{std::make_unique<Impl>(device_memory_)} {}
ShaderCache::~ShaderCache() = default;
void ShaderCache::InvalidateRegion(VAddr addr, size_t size) {
impl->InvalidateRegion(addr, size);
}
-const ShaderInfo* ShaderCache::MakeShaderInfo(GenericEnvironment& env, VAddr cpu_addr) {
-    auto info = std::make_unique<ShaderInfo>();
-    if (const std::optional<u64> cached_hash{env.Analyze()}) {
-        info->unique_hash = *cached_hash;
-        info->size_bytes = env.CachedSizeBytes();
-    } else {
-        // Slow path, not really hit on commercial games
-        // Build a control flow graph to get the real shader size
-        Shader::ObjectPool<Shader::Maxwell::Flow::Block> flow_block;
-        Shader::Maxwell::Flow::CFG cfg{env, flow_block, env.StartAddress()};
-        info->unique_hash = env.CalculateHash();
-        info->size_bytes = env.ReadSizeBytes();
-    }
-    const size_t size_bytes{info->size_bytes};
-    const ShaderInfo* const result{info.get()};
-    Register(std::move(info), cpu_addr, size_bytes);
-    return result;
-}
+void ShaderCache::OnCacheInvalidation(VAddr addr, size_t size) {
+    impl->OnCacheInvalidation(addr, size);
+}
+
+void ShaderCache::SyncGuestHost() {
+    impl->SyncGuestHost();
+}
+
+bool ShaderCache::RefreshStages(std::array<u64, 6>& unique_hashes) {
+    return impl->RefreshStages(unique_hashes);
+}
+
+const ShaderInfo* ShaderCache::ComputeShader() {
+    return impl->ComputeShader();
+}
+
+void ShaderCache::GetGraphicsEnvironments(GraphicsEnvironments& result,
+                                          const std::array<u64, NUM_PROGRAMS>& unique_hashes) {
+    impl->GetGraphicsEnvironments(result, unique_hashes);
+}
+
+ShaderInfo* ShaderCache::TryGet(VAddr addr) const {
+    return impl->TryGet(addr);
+}
+
+void ShaderCache::Register(std::unique_ptr<ShaderInfo> data, VAddr addr, size_t size) {
+    impl->Register(std::move(data), addr, size);
+}
} // namespace VideoCommon
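
One caveat in the LoadCache/SaveCache pair above: entries are streamed to disk as raw ShaderInfo bytes, which is only well-defined for trivially copyable types and breaks silently whenever the struct layout changes. A hedged sketch of the kind of guard a follow-up could add (CACHE_MAGIC, CACHE_VERSION, CacheHeader and the ShaderInfoPod stand-in are hypothetical, not part of this commit):

#include <cstdint>
#include <fstream>
#include <type_traits>

// Stand-in for ShaderInfo; the real struct would need the same property.
struct ShaderInfoPod {
    std::uint64_t unique_hash;
    std::uint64_t size_bytes;
};
static_assert(std::is_trivially_copyable_v<ShaderInfoPod>,
              "raw read/write is only valid for trivially copyable types");

constexpr std::uint32_t CACHE_MAGIC = 0x53435348; // arbitrary "SCSH" tag
constexpr std::uint32_t CACHE_VERSION = 1;

struct CacheHeader {
    std::uint32_t magic;
    std::uint32_t version;
    std::uint64_t num_entries;
};

// Reject unreadable, foreign, or out-of-date cache files up front instead
// of blindly trusting the entry count, as the committed code currently does.
bool ValidateHeader(std::ifstream& file) {
    CacheHeader header{};
    file.read(reinterpret_cast<char*>(&header), sizeof(header));
    return file.good() && header.magic == CACHE_MAGIC && header.version == CACHE_VERSION;
}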