C# - LINQ performance: should I first use `Where` or `Select`?
I have a large list in memory, of a class that has about 20 properties. I'd like to filter this list based on just one property, and for a particular task I only need a list of that property. So my query is something like:

    data.Select(x => x.Field).Where(x => x == "desired value").ToList()

Which one gives me better performance, using Select first, or using Where?

    data.Where(x => x.Field == "desired value").Select(x => x.Field).ToList()

Please let me know if this depends on the data type I'm keeping the data in memory with, or on the field's type. Please note that I need these objects for other tasks too, so I can't filter them in the first place, before loading them into memory.
"Which one gives me better performance, using Select first, or using Where?"

The Where-first approach is more performant, since it filters the collection first and then executes Select for the filtered values only.
Mathematically speaking, the Where-first approach takes N + N' operations, where N is the number of items in the collection and N' is the number of items that satisfy the Where condition.
So it takes N + 0 = N operations at minimum (if no items pass the Where condition) and N + N = 2 * N operations at maximum (if all items pass the condition).
The Select-first approach, on the other hand, always takes 2 * N operations, since it iterates through all objects to acquire the property and then iterates through all of the projected values to filter them.
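To make the N + N' versus 2 * N count concrete, here is a minimal sketch that counts delegate invocations on a tiny, deterministic list. The sample data (10 tuples, 3 of which match) and the names `whereFirstCalls` and `selectFirstCalls` are made up for illustration, and the snippet assumes C# 9 or later so that it can use top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: N = 10 items, N' = 3 of which satisfy the condition.
var data = new List<(string Field, int Other)>();
for (int i = 0; i < 10; i++)
{
    data.Add((i < 3 ? "desired value" : "other", i));
}

// Where first: the predicate runs N times, the selector only N' times.
int whereFirstCalls = 0;
var whereFirst = data
    .Where(x => { whereFirstCalls++; return x.Field == "desired value"; })
    .Select(x => { whereFirstCalls++; return x.Field; })
    .ToList();

// Select first: the selector runs N times, then the predicate runs N times.
int selectFirstCalls = 0;
var selectFirst = data
    .Select(x => { selectFirstCalls++; return x.Field; })
    .Where(x => { selectFirstCalls++; return x == "desired value"; })
    .ToList();

Console.WriteLine($"Where -> Select: {whereFirstCalls} delegate calls"); // 10 + 3 = 13
Console.WriteLine($"Select -> Where: {selectFirstCalls} delegate calls"); // 10 + 10 = 20
```

Both queries return the same three strings here; only the number of delegate calls differs.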
Benchmark proof

I ran a benchmark to back up this answer.

Results:

    Condition value: 50
    Where -> Select: 88 ms, 10500319 hits
    Select -> Where: 137 ms, 20000000 hits

    Condition value: 500
    Where -> Select: 187 ms, 14999212 hits
    Select -> Where: 238 ms, 20000000 hits

    Condition value: 950
    Where -> Select: 186 ms, 19500126 hits
    Select -> Where: 402 ms, 20000000 hits

If you run the benchmark many times, you will see that the hit count of the Where -> Select approach changes from run to run (the test data is random), while the Select -> Where approach always takes 2N operations.
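As a rough sanity check, the reported hit counts can be compared against the N + N' estimate. The sketch below assumes what the benchmark code uses: N = 10,000,000 points, X drawn uniformly from 0 to 999, and the predicate X < conditionValue, so that on average N' = N * conditionValue / 1000. The variable names are illustrative, and the snippet again assumes C# 9 or later top-level statements.

```csharp
using System;

// Assumptions taken from the benchmark: N = 10,000,000 points,
// X drawn uniformly from 0..999, predicate X < conditionValue.
const long n = 10_000_000;
int[] conditionValues = { 50, 500, 950 };
var expected = new long[conditionValues.Length];

for (int i = 0; i < conditionValues.Length; i++)
{
    // Expected number of items passing the predicate: N' = N * conditionValue / 1000.
    long passing = n * conditionValues[i] / 1000;

    // Where -> Select invokes the predicate N times and the selector N' times.
    expected[i] = n + passing;
    Console.WriteLine($"Condition value {conditionValues[i]}: about {expected[i]} hits expected");
}
```

The estimates of 10,500,000, 15,000,000 and 19,500,000 hits are close to the measured 10500319, 14999212 and 19500126; the small differences come from the random data.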
Code (also posted as an Ideone demonstration):

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;

    class Point
    {
        public int X { get; set; }
        public int Y { get; set; }
    }

    class Program
    {
        static void Main()
        {
            // Build 10,000,000 points with random coordinates in the range 0..999.
            var random = new Random();
            List<Point> points = Enumerable.Range(0, 10000000)
                .Select(x => new Point { X = random.Next(1000), Y = random.Next(1000) })
                .ToList();

            int conditionValue = 250;
            Console.WriteLine($"Condition value: {conditionValue}");

            Stopwatch sw = new Stopwatch();
            sw.Start();

            // Where -> Select: hitCount1 counts every predicate and selector call.
            int hitCount1 = 0;
            var points1 = points.Where(x =>
            {
                hitCount1++;
                return x.X < conditionValue;
            }).Select(x =>
            {
                hitCount1++;
                return x.Y;
            }).ToArray();

            sw.Stop();
            Console.WriteLine($"Where -> Select: {sw.ElapsedMilliseconds} ms, {hitCount1} hits");

            sw.Restart();

            // Select -> Where: hitCount2 counts every selector and predicate call.
            int hitCount2 = 0;
            var points2 = points.Select(x =>
            {
                hitCount2++;
                return x.Y;
            }).Where(x =>
            {
                hitCount2++;
                return x < conditionValue;
            }).ToArray();

            sw.Stop();
            Console.WriteLine($"Select -> Where: {sw.ElapsedMilliseconds} ms, {hitCount2} hits");

            Console.ReadLine();
        }
    }

Related questions

These questions may also be interesting to you. They are not about Select and Where specifically, but they are about LINQ order performance:

Does the order of LINQ functions matter?
Order of LINQ extension methods does not affect performance?