-
Notifications
You must be signed in to change notification settings - Fork 3.1k
SI-7149 Use a WeakHashSet for type uniqueness #2605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Review @gkossakowski. |
Cleaned up several typos in comments. |
I thought we were not using any form of WeakReferences in uniques set thus we were accumulating those types for the entire compiler run. If you look at the diff in 4eb4aa5 you'll see that there are no WeakReferences involved there. Am I missing something? |
You're right, my commit message is wrong. |
@JamesIry: could you update the commit and PR message? |
Replaces scala.reflect.internal.WeakHashSet with a version that * extends the mutable.Set trait * doesn't leak WeakReferences * is unit tested
Currently type uniqueness is done via a HashSet[Type], but that means the Types live through an entire compile session, even ones that are used once. The result is a huge amount of unnecessarily retained memory. This commit uses a WeakHashSet instead so that Types and their WeakReferences are cleaned up when no longer in use.
perRunCaches was using a HashMap of WeakReferences which meant it would accumulate WeakReferences over time. This commit uses a WeakHashSet instead so that the references are cleaned up.
Cleaned up the commit message and added a comment to the WeakHashSet's hash avalanche mechanism. |
LGTM. Great work! I did memory profiling of this change. I'm comparing b88801f (parent commit of this PR) and 29babb5 (last commit in this PR). I used YourKit and custom probe to collect memory snapshot after different phases. See YourKit documentation for detailed explanation of probes. I compiled Scala library with probes enabled like this:
Now the numbers (for compiling the Scala library)
Great work, again! |
SI-7149 Use a WeakHashSet for type uniqueness
You got to be kidding me!! |
FANTASTIC |
Wow, the benchmark result is TOTALLY awesome. Great work! |
excellent job! |
👍 |
/** A bare-bones implementation of a mutable `Set` that uses weak references | ||
* to hold the elements. | ||
/** | ||
* A HashSet where the elements are stored weakly. Elements in this set are elligible for GC if no other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: eligible
Backport #2605 to 2.10.x: SI-7149 Use a WeakHashSet for type uniqueness
This mimics what Scala 2 has been doing for a long time now and serves the same purpose: it considerably reduces peak memory usage when compiling some projects, for example previously compiling the Scalatest tests required a heap of at least 11 GB, but now it fits in about 4 GB. This required changing the implementation of WeakHashSet to have overridable `hash` and `isEqual` methods just like HashSet, it also required making various private methods protected since NamedTypeUniques and AppliedUniques contain an inlined implementation of `put`. This commit also changes the default load factor of a WeakHashSet from 0.75 to 0.5 to match the load factor we use for HashSets, though note that Scala 2 has always been using 0.75. For a history of the usage of WeakHashSet in Scala 2 see: - scala/scala#247 - scala/scala#2605 - scala/scala#2901
This mimics what Scala 2 has been doing for a long time now and serves the same purpose: it considerably reduces peak memory usage when compiling some projects, for example previously compiling the Scalatest tests required a heap of at least 11 GB, but now it fits in about 4 GB. This required changing the implementation of WeakHashSet to have overridable `hash` and `isEqual` methods just like HashSet, it also required making various private methods protected since NamedTypeUniques and AppliedUniques contain an inlined implementation of `put`. This commit also changes the default load factor of a WeakHashSet from 0.75 to 0.5 to match the load factor we use for HashSets, though note that Scala 2 has always been using 0.75. For a history of the usage of WeakHashSet in Scala 2 see: - scala/scala#247 - scala/scala#2605 - scala/scala#2901
This mimics what Scala 2 has been doing for a long time now and serves the same purpose: it considerably reduces peak memory usage when compiling some projects, for example previously compiling the Scalatest tests required a heap of at least 11 GB, but now it fits in about 4 GB. This required changing the implementation of WeakHashSet to have overridable `hash` and `isEqual` methods just like HashSet, it also required making various private methods protected since NamedTypeUniques and AppliedUniques contain an inlined implementation of `put`. This commit also changes the default load factor of a WeakHashSet from 0.75 to 0.5 to match the load factor we use for HashSets, though note that Scala 2 has always been using 0.75. For a history of the usage of WeakHashSet in Scala 2 see: - scala/scala#247 - scala/scala#2605 - scala/scala#2901
We have an implementation of WeakHashSet in the compiler but it never cleans up its WeakReferences. The first commit in this PR replaces the implementation with one that does clean up. Along the way it modifies the interface to follow the standard scala.collection.mutable.Set interface so that it can eventually be moved into the main library should we choose to do so.
Currently type uniqueness is done via a HashSet[Type], but
that means the Types live through an entire compile session, even
ones that are used once. The result is a huge amount of unnecessarily
retained memory. This commit uses a WeakHashSet instead so that Types
and their WeakReferences are cleaned up when no longer in use.
Finally, perRunCaches is also updated to use the WeakHashSet instead of a HashSet of WeakReferences because the former was never cleaning up unused WeakReferences.