StreamReader.ReadLineAsync performance can be improved #62061
Tagging subscribers to this area: @dotnet/area-system-io
Issue Details
Description
Stream.ReadLineAsync is 6 times slower than Stream.ReadLine in .NET 6. Slightly simpler case is 24 times slower.
Reproduction Steps
Sync version:
Async version:
Sync version finishes in 11 seconds, Async finishes in 1 minute and 5 seconds.
And we take this shorter sync version:
and async version:
Sync version took 17 seconds. Async version took 5 minutes and 35 seconds. It is 24 times slower.
Expected behavior
I believe the async version should be slower than sync, but not by that much, especially after I read that there are a lot of improvements to async FileStream in .NET 6.
Actual behavior
Stream.ReadLineAsync is 6 times slower than Stream.ReadLine
Regression?
No response
Known Workarounds
No response
Configuration
.NET 6
File
Other information
I filed this as a bug report because a lot of developers are sure that this is a perfect case for async and, as a result, they use async here, making the code quite slow. It also forces us to use the sync version in some cases, which does not feel right.
|
Additional overhead may be included in your measurement, for example initializing the thread pool. You should use BenchmarkDotNet to get a more reliable measurement. Async version is expected to be slightly slower. |
@vladimir-cheverdyuk-altium can you make sure you're running your app in Release?
using System.Diagnostics;
const string FILE_PATH = "C:\\dev\\big_file.txt";
using (var file = File.CreateText(FILE_PATH))
{
const string line = "The quick brown fox jumps over the lazy dog.";
for (var i = 0; i < 10_000_000; ++i)
{
file.WriteLine(line);
}
}
var sw = Stopwatch.StartNew();
using (var stream = new FileStream(FILE_PATH, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 32 * 1024, true))
{
using (var reader = new StreamReader(stream))
{
while (true)
{
var read = await reader.ReadLineAsync();
if (read == null) break;
}
}
}
Console.WriteLine("Async Version: " + sw.Elapsed);
sw = Stopwatch.StartNew();
using (var stream = new FileStream(FILE_PATH, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 32 * 1024, false))
{
using (var reader = new StreamReader(stream))
{
while (true)
{
var read = reader.ReadLine();
if (read == null) break;
}
}
}
Console.WriteLine("Sync Version: " + sw.Elapsed);
The above code executed with:
Async Version: 00:00:01.4220594
Sync Version: 00:00:00.7919433
@huoyaoyuan although |
@zlatanov, I re-ran my tests as you suggested. The first case is 11 seconds vs 20. The second, simpler case is 14 seconds vs 30. That is still about a 2x difference. It is much better, but I expected maybe a 10-20% difference due to the more complex code in the async version. Sorry, I did my testing from VS 2022 in Release configuration. I know that the debugger affects performance, but I expected both versions to be affected similarly. I didn't expect the debugger to affect the async version that much :( |
That is why we encourage our users to use BenchmarkDotNet for benchmarking (one of the things it does is enforce running in Release). I've written the following benchmark based on the source code provided by @zlatanov:
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFrameworks>net5.0;net6.0</TargetFrameworks>
<LangVersion>9.0</LangVersion>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="BenchmarkDotNet" Version="0.13.1" />
</ItemGroup>
</Project>
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.IO;
using System.Threading.Tasks;
namespace ReadLinesPerf
{
internal class Program
{
static void Main(string[] args) => BenchmarkRunner.Run<Benchmarks>(args: args);
}
public class Benchmarks
{
const string FilePath = "big_file.txt";
[GlobalSetup]
public void Setup()
{
using StreamWriter writer = File.CreateText(FilePath);
for (int i = 0; i < 10_000_000; ++i)
writer.WriteLine("The quick brown fox jumps over the lazy dog.");
}
[GlobalCleanup]
public void Cleanup() => File.Delete(FilePath);
[Benchmark]
public void ReadLine()
{
using FileStream stream = new (FilePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 1, false);
using StreamReader reader = new (stream, bufferSize: 32 * 1024);
while (reader.ReadLine() != null) ;
}
[Benchmark]
public async Task ReadLineAsync()
{
using FileStream stream = new (FilePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 1, true);
using StreamReader reader = new (stream, bufferSize: 32 * 1024);
while (await reader.ReadLineAsync() != null) ;
}
}
}
And run it for .NET 5 and 6 using the following command:
dotnet run -c Release -f net5.0 --runtimes net5.0 net6.0 --memory
And got the following results:
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
AMD Ryzen Threadripper PRO 3945WX 12-Cores, 1 CPU, 24 logical and 12 physical cores
.NET SDK=6.0.100
[Host] : .NET 5.0.12 (5.0.1221.52207), X64 RyuJIT
Job-BGKKLI : .NET 5.0.12 (5.0.1221.52207), X64 RyuJIT
Job-BUOARD : .NET 6.0.0 (6.0.21.52210), X64 RyuJIT
My observations:
I had a quick look at the implementation and there is definitely room for improvement. I am going to change the issue title (there is no such thing as Stream.ReadLineAsync, and it's 2 (not 6) times slower) and mark it up-for-grabs. |
@adamsitnik Have you already profiled ReadLineAsync and do you know where the bottleneck is? |
@deeprobin I've not profiled it yet. |
I want to work on that issue and improve the performance. I haven't committed to the runtime before, so I don't really know the rules; I'm going to look at the docs first and then profile it. |
@Trapov great! I've assigned you to the issue. You may find the following docs useful: |
@adamsitnik Can I reuse the same benchmark that you posted above in the performance repo? |
@Trapov sure! |
Before:
After:
I think I could squeeze out more. @adamsitnik |
@Trapov impressive! |
May I ask why there is such a big difference when I run these tests from Visual Studio? After all, most development happens in Visual Studio, and somebody could replace sync code with async, get such awful performance, decide that it is not worth it, and revert to the sync version. |
With or without the debugger attached? A debugger attached makes lots of things slower, in particular related to threading. |
With debugger attached. I understand that performance will be slower but in this case it is up to 12 times slower. Is it reasonable? |
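Since the numbers in this thread changed so much depending on whether a debugger was attached, a hand-rolled Stopwatch test can at least warn about it before timing anything. The following is a minimal sketch, not something from the runtime or BenchmarkDotNet: `TimingGuard` and its method names are hypothetical, and it only relies on `Debugger.IsAttached` and `DebuggableAttribute.IsJITOptimizerDisabled`.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Reflection;

// Hypothetical helper (an assumption for illustration, not part of .NET or
// BenchmarkDotNet): lists conditions that make Stopwatch timings unreliable.
public static class TimingGuard
{
    // Pure function so the logic can be checked without a real debugger.
    public static List<string> Describe(bool debuggerAttached, bool jitOptimizerDisabled)
    {
        var warnings = new List<string>();
        if (debuggerAttached)
            warnings.Add("A debugger is attached; async and threading-heavy code can run much slower.");
        if (jitOptimizerDisabled)
            warnings.Add("JIT optimizations are disabled (typical for Debug builds).");
        return warnings;
    }

    // Inspects the current process: debugger state plus the entry assembly's
    // DebuggableAttribute, which Debug builds usually emit with optimizations off.
    public static List<string> DescribeCurrentProcess()
    {
        var entry = Assembly.GetEntryAssembly(); // may be null in some hosts
        var debuggable = entry?.GetCustomAttribute<DebuggableAttribute>();
        bool optimizerDisabled = debuggable != null && debuggable.IsJITOptimizerDisabled;
        return Describe(Debugger.IsAttached, optimizerDisabled);
    }
}
```

Calling `TimingGuard.DescribeCurrentProcess()` before `Stopwatch.StartNew()` and printing any returned warnings would flag the situation described above. It does not replace BenchmarkDotNet, which also handles warm-up and statistics.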
Results from .NET 7 using the benchmark above.
BenchmarkDotNet=v0.13.4, OS=Windows 10 (10.0.19044.2251/21H2/November2021Update)
Intel Core i7-8565U CPU 1.80GHz (Whiskey Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK=7.0.100-rc.1.22431.12
[Host] : .NET 7.0.0 (7.0.22.42610), X64 RyuJIT AVX2
DefaultJob : .NET 7.0.0 (7.0.22.42610), X64 RyuJIT AVX2
|
#69888 was merged for .NET 8. |
Description
Stream.ReadLineAsync is 6 times slower than Stream.ReadLine in .NET 6. Slightly simpler case is 24 times slower.
Reproduction Steps
Sync version
Async version:
Sync version finishes in 11 seconds, Async finishes in 1 minute and 5 seconds
And we take this shorter sync version:
and async version:
Sync version took 17 seconds. Async version took 5 minutes and 35 seconds. It is 24 times slower.
Expected behavior
I believe the async version should be slower than sync, but not by that much, especially after I read that there are a lot of improvements to async FileStream in .NET 6.
Actual behavior
Stream.ReadLineAsync is 6 times slower than Stream.ReadLine
Regression?
No response
Known Workarounds
No response
Configuration
.NET 6
Windows 10, Pro, x64
x86
Intel 9900K, 32GB RAM
File
E:\Temp\file8.txt.bak
located on HDD and is 8 GB. It contains only English alphanumeric characters, spaces and periods. For example:
4815995. Something something something
Other information
I filed this as a bug report because a lot of developers are sure that this is a perfect case for async and, as a result, they use async here, making the code quite slow. It also forces us to use the sync version in some cases, which does not feel right.