[squeak-dev] sends of #class and #== considered harmful, may we please stop?

Hi Chris,

I think it should work the other way around. When you want to be able to
use proxies everywhere (which can't work btw, because everywhere is just
too vague), there should be a switch - a preference with an action - that
recompiles all methods with the #== and #class bytescodes disabled except
for those methods which have a special pragma preventing such
deoptimization.

I'm surprised you didn't mention other special methods like #ifTrue:,
#whileTrue:, #repeat:, #to:do:, etc. I think that's because you rarely
create proxies for numbers, booleans and blocks.

Levente

P.S.: Please don't use #species for such thing. Why? #species is supposed
to return "the preferred class for reconstructing the receiver". It is not
a substitute for #class.

Post by Chris Muller
Hi guys,
Something I've been wanting to ask the community for years, but never
had the gumption, was about changing how we write our #= and #hash
methods, but now that we're combing through them anyway, I must!
Basically, it comes down to Proxy's. I want them to work better in
Squeak. But every time we put another send to #class or #== in an
implementation of #=, since those methods are inlined, they're too
brittle to work with Proxy's. This causes hard-to-trackdown bugs in
applications and mini-messes to deal with them since the app is forced
to change its code to become "aware" of potential Proxy'iness of an
object before comparing it with #=.
This is no surprise, since writing a send to #== instead of #= for no
more than "performance" is actually breaking encapsulation in the
first place...
There is an easy solution. When writing #= and #hash implementations,
use (#species or #xxxClass or #isKindOf: instead of #class) and #=
instead #==. So, for example, Eliot, I want to upgrade your new
= anObject
^self xxxClass = anObject xxxClass
and: [selector = anObject selector "selector lookup is by identity"
and: [lookupClass = anObject lookupClass
and: [args literalEqual: anObject arguments]]]
Or #species or #isKindOf:, like we do in many other methods. Now the
method is Proxy-com

Eliot Miranda

2018-11-23 19:01:33 UTC

Post by Levente Uzonyi
Hi Chris,
I think it should work the other way around. When you want to be able to use proxies everywhere (which can't work btw, because everywhere is just too vague), there should be a switch - a preference with an action - that recompiles all methods with the #== and #class bytescodes disabled except for those methods which have a special pragma preventing such deoptimization.

+1 on the idea; there are a few implementation choices

Post by Levente Uzonyi
I'm surprised you didn't mention other special methods like #ifTrue:, #whileTrue:, #repeat:, #to:do:, etc. I think that's because you rarely create proxies for numbers, booleans and blocks.

+1. Again see Marcus’ super cool reification of inlined blocks in Pharo, even if it is slooooow :-)

Post by Levente Uzonyi
Levente
P.S.: Please don't use #species for such thing. Why? #species is supposed to return "the preferred class for reconstructing the receiver". It is not a substitute for #class.

Post by Levente Uzonyi

Chris Muller

2018-11-24 04:19:53 UTC

Post by Levente Uzonyi
I think it should work the other way around. When you want to be able to use proxies everywhere (which can't work btw, because everywhere is just too vague), there should be a switch - a preference with an action - that recompiles all methods with the #== and #class bytescodes disabled except for those methods which have a special pragma preventing such deoptimization.

+1 on the idea; there are a few implementation choices

Guys, your aggressive imagination for how far you could go to to build
out "ultimate transparency" is fun, but recompiling the users entire
image? So every package in the users image is suddenly dirty then?
Maybe I've misunderstood the the idea, but it seems unworkable.
What about development while connected to the DB? When a developer
changes a method after the recompile, when will it get recompiled
again to ignore those bytecodes? What about when I actually want the
Proxy's class itself? Plus, like Eliots suggestion, it's only
class-specific instead of instance-specifc, so it slows things down
even where it didn't need to.

As I'm sure you're aware, singularly pursuing one goal in a complex
system can easily and unwittingly knock down other useful and
necessary aspects of the system.

Why can't we simply let #privClass be the inlined message and #class
return "self privClass"? Then, if we will stop writing code that
makes assumptions about implementations everywhere by using #== or an
IdentityDictionary's because "I know I'm dealing with a Symbol, so
make it 1% faster!!" It seems like these obsessions with optimization
have reached a feverish level its becoming detrimental. If we're
supposed to be OO masters, delegation to #= and letting it inherit the
one in Object is the more proper way to write that.

Besides which, sending inlined messages to a Proxy isn't the whole of
the story, there is also the issue of when a Proxy object (with named
pointers) gets passed as an argument to a primitive that expects a
different class-type (e.g., a Proxy to some Form bytes or something).
Magma deals with this by compiling a subclass and generating overrides
of all methods with primitive sends. It overrides #xxxClass (which I
would be more than happy to rename!) so that equivalence COULD work,
if only we had a wrapping message. Like my idea about #privClass...

+1. Again see Marcus’ super cool reification of inlined blocks in Pharo, even if it is slooooow :-)

Then there's no point in my looking at it. I'm seeking a reasonable
balance of trade-offs and speed is one I need to keep. This is not
just a learning experiment or school project, but real apps for human
end users an

Chris Muller

2018-11-24 04:56:05 UTC

Post by Chris Muller
Why can't we simply let #privClass be the inlined message and #class
return "self privClass"?

Or #basicClass. You were right when you told me #basic is the better
p

Marcus Denker

2018-11-28 15:59:31 UTC

#class does never use the special send, I think #== still does as using it does not do an interrupt check, while it would if it would be a message send it would, which can have very bad side effects especially in
the code for process scheduling…

For compiler pragmas, in Opal we added pragmas, e.g. to turn on #repeat inlining, which is off by default for backward compatibility:

exampleRepeatEffect
<compilerOptions: #(#+ #optionInlineRepeat)>
| i |
i := 1.
[
i := i + 1.
i = 10
ifTrue: [ ^ true ] ] repeat

this can be set per class, too. For example, it was hard-coded deep the compiler that InstructionStream ivar access is compiled specially for Cog. This is now done by overriding #compiler on the class side:

compiler
"The JIT compiler needs to trap all reads to instance variables of contexts. As this check is costly, it is only done
in the long form of the bytecodes, which are not used often. In this hierarchy we force the compiler to alwasy generate
long bytecodes"
^super compiler options: #(+ optionLongIvarAccessBytecodes)

Post by Eliot Miranda
+1 on the idea; there are a few implementation choices

+1. Again see Marcus’ super cool reification of inlined blocks in Pharo, even if it is slooooow :-)

Yes, it is very slow. The goal was more these two:

1) do not loose the clever student in the first lecture. They do the “lets implement 3 value logic” and are *very* upset when they realise that the system is not as nice as they thought :-)
2) understand the problem better

and of course, it is not much code. This is the full code (and it can be turned off by a setting):

mustBeBooleanDeOptimizeIn: context
"Permits to redefine methods inlined by compiler.
Take the ast node corresponding to the mustBeBoolean error, compile it on the fly and executes it as a DoIt. Then resume the execution of the context."

| sendNode methodNode method ret |

"get the message send node that triggered mustBeBoolean"
sendNode := context sourceNode sourceNodeForPC: context pc - 1.

"Rewrite non-local returns to return to the correct context from send"
RBParseTreeRewriter new
replace: '^ ``@value' with: 'ThisContext home return: ``@value';
executeTree: sendNode.

"Build doit node to perform send unoptimized"
methodNode := sendNode copy asDoitForContext: context.

"Keep same compilation context as the sender node's"
methodNode compilationContext: sendNode methodNode compilationContext copy.

"Disable inlining so the message send will be unoptimized"
methodNode compilationContext compilerOptions: #(- optionInlineIf optionInlineAndOr optionInlineWhile).

"Generate the method"
OCASTSemanticCleaner clean: methodNode.
method := methodNode generate.

"Execute the generated method with the pc still at the optimzized block so that the lookUp can read variables defined in the optimized block"
ret := context receiver withArgs: {context} executeMethod: method.

"resume the context at the instruction following the send when returning from deoptimized code."
context pc: sendNode irInstruction nextBytecodeOffsetAfterJump.
^ret.

So this means we do the compilation for every execution. I wonder if this could not be speed up by caching… maybe even one could do somet

Marcus Denker

2018-11-30 12:21:02 UTC

Post by Marcus Denker

Post by Eliot Miranda
+1. Again see Marcus’ super cool reification of inlined blocks in Pharo, even if it is slooooow :-)

1) do not loose the clever student in the first lecture. They do the “lets implement 3 value logic” and are *very* upset when they realise that the system is not as nice as they thought :-)
2) understand the problem better

….

Post by Marcus Denker
So this means we do the compilation for every execution. I wonder if this could not be speed up by caching… maybe even one could do something similar at the JIT level?

I now implemented caching of the compiled DoIt. It adds a method property #mustBeBooleanCache which is a dictionary that stores the doit per pc.
The cached method itself stores the pc after the jump. (#mustBeBooleanJump):

mustBeBooleanDeOptimizeIn: context
"Permits to redefine methods inlined by compiler.
Take the ast node corresponding to the mustBeBoolean error, compile it on the fly and executes it as a DoIt. Then resume the execution of the context.
the generated DoIts are cached in the calling method"

| ret cache method |

cache := context method propertyAt: #mustBeBooleanCache ifAbsentPut: [ IdentityDictionary new ].
"compile a doit method for the unoptimized expression"
method := cache at: (context pc - 1) ifAbsent: [self mustBeBooleanCompileExpression: context andCache: cache ].
"Execute the generated method with the pc still at the optimzized block so that the lookUp can read variables defined in the optimized block"
ret := context receiver withArgs: {context} executeMethod: method.
"resume the context at the instruction following the send when returning from deoptimized code."
context pc: (method propertyAt: #mustBeBooleanJump).
^ret.

with the compilation done in this method:

mustBeBooleanCompileExpression: context andCache: cache
"Permits to redefine methods inlined by compiler.
Take the ast node corresponding to the mustBeBoolean error, compile it on the fly and executes it as a DoIt. Then resume the execution of the context."

| sendNode methodNode pc method pcAfterJump |

"get the message send node that triggered mustBeBoolean"
pc := context pc - 1.
sendNode := context sourceNode sourceNodeForPC: pc.
"Rewrite non-local returns to return to the correct context from send"
RBParseTreeRewriter new
replace: '^ ``@value' with: 'ThisContext home return: ``@value';
executeTree: sendNode.
"Build doit node to perform send unoptimized"
methodNode := sendNode copy asDoitForContext: context.
"Keep same compilation context as the sender node's"
methodNode compilationContext: sendNode methodNode compilationContext copy.
"Disable inlining so the message send will be unoptimized"
methodNode compilationContext compilerOptions: #(- optionInlineIf optionInlineAndOr optionInlineWhile).
"Generate the method"
OCASTSemanticCleaner clean: methodNode.
method := methodNode generate.
"store the pc of the instruction following the send when returning from deoptimized code."
pcAfterJump := sendNode irInstruction nextBytecodeOffsetAfterJump.
method propertyAt: #mustBeBooleanJump put: pcAfterJump.
"cache the method we just created"
cache at: pc put: method.
^method

seems lead to a factor 200 speedup for trivial examples. The “hot” code-path now just gets the method from the cache and executes it,
so all the overhead of compilation. And the mappi

Eliot Miranda

2018-12-01 17:45:18 UTC

Marcus,

Post by Marcus Denker

Post by Eliot Miranda
+1. Again see Marcus’ super cool reification of inlined blocks in Pharo, even if it is slooooow :-)

….

Post by Marcus Denker
So this means we do the compilation for every execution. I wonder if this could not be speed up by caching… maybe even one could do something similar at the JIT level?

I now implemented caching of the compiled DoIt. It adds a method property #mustBeBooleanCache which is a dictionary that stores the doit per pc.
mustBeBooleanDeOptimizeIn: context
"Permits to redefine methods inlined by compiler.
Take the ast node corresponding to the mustBeBoolean error, compile it on the fly and executes it as a DoIt. Then resume the execution of the context.
the generated DoIts are cached in the calling method"
| ret cache method |
cache := context method propertyAt: #mustBeBooleanCache ifAbsentPut: [ IdentityDictionary new ].
"compile a doit method for the unoptimized expression"
method := cache at: (context pc - 1) ifAbsent: [self mustBeBooleanCompileExpression: context andCache: cache ].
"Execute the generated method with the pc still at the optimzized block so that the lookUp can read variables defined in the optimized block"
ret := context receiver withArgs: {context} executeMethod: method.
"resume the context at the instruction following the send when returning from deoptimized code."
context pc: (method propertyAt: #mustBeBooleanJump).
^ret.
mustBeBooleanCompileExpression: context andCache: cache
"Permits to redefine methods inlined by compiler.
Take the ast node corresponding to the mustBeBoolean error, compile it on the fly and executes it as a DoIt. Then resume the execution of the context."
| sendNode methodNode pc method pcAfterJump |
"get the message send node that triggered mustBeBoolean"
pc := context pc - 1.
sendNode := context sourceNode sourceNodeForPC: pc.
"Rewrite non-local returns to return to the correct context from send"
RBParseTreeRewriter new
executeTree: sendNode.
"Build doit node to perform send unoptimized"
methodNode := sendNode copy asDoitForContext: context.
"Keep same compilation context as the sender node's"
methodNode compilationContext: sendNode methodNode compilationContext copy.
"Disable inlining so the message send will be unoptimized"
methodNode compilationContext compilerOptions: #(- optionInlineIf optionInlineAndOr optionInlineWhile).
"Generate the method"
OCASTSemanticCleaner clean: methodNode.
method := methodNode generate.
"store the pc of the instruction following the send when returning from deoptimized code."
pcAfterJump := sendNode irInstruction nextBytecodeOffsetAfterJump.
method propertyAt: #mustBeBooleanJump put: pcAfterJump.
"cache the method we just created"
cache at: pc put: method.
^method
seems lead to a factor 200 speedup for trivial examples. The “hot” code-path now just gets the method from the cache and executes it,
so all the overhead of compilation. And the mapping pc-AST is not needed at runtime.

Bravo! This (the entire scheme including caching) is worth writing up as an example of language extensibility in Smalltalk. You might be able to get it accepted at SPLASH. Personally I love to see coherent use of reflection like

Eliot Miranda

2018-11-23 18:59:21 UTC

Hi Chris,

The disease is the inclining of #class and #==, not the use of these in comparison methods. In the VisualWorks vm I implemented a command-line/image header switch that enabled/disabled this inlining. It is easy to implement the same thing on Cog.

If we did, then these methods would not be inlined if the flag is enabled, on a per-image basis.

An alternative is Marcus Denker’s Opal compiler for Pharo which compiles #class and #== to normal sends, avoiding the inlined special selector bytecodes for these sends. That’s arguably an even better solution, because it retains the ability to inline. Although the mirror methods objectClass: and object:eqeq: can be used, with more overhead.

In any case, instead of proposing to change a natural and comprehensible idiom, we can instead evolve the implementation to meet our needs.

Post by Chris Muller
This is no surprise, since writing a send to #== instead of #= for no
more than "performance" is actually breaking encapsulation in the
first place...
There is an easy solution. When writing #= and #hash implementations,
use (#species or #xxxClass or #isKindOf: instead of #class) and #=
instead #==. So, for example, Eliot, I want to upgrade your new
= anObject
^self xxxClass = anObject xxxClass
and: [selector = anObject selector "selector lookup is by identity"
and: [lookupClass = anObject lookupClass
and: [args literalEqual: anObject arguments]]]
Or #species or #isKindOf:, like we do in many other methods. Now the
method is Proxy-compatible.
What do you think?

I think it is wrong. xxxClass is a horrible hack that should be ban

Chris Muller

2018-11-24 04:04:24 UTC

Hi Eliot,

Post by Chris Muller
Basically, it comes down to Proxy's. I want them to work better in
Squeak. But every time we put another send to #class or #== in an
implementation of #=, since those methods are inlined, they're too
brittle to work with Proxy's. This causes hard-to-trackdown bugs in
applications and mini-messes to deal with them since the app is forced
to change its code to become "aware" of potential Proxy'iness of an
object before comparing it with #=.

Having consistent send properties among all messages would be cool.
Having said that, I do like performance, and there aren't that many
in-lined messages, and experience has trained me to recognize when I
use them and pay attention to the Proxy-dynamics. A "global switch"
approach would force the entire image to incur the overhead, even for
non-DB objects like the Morphs of the UI. A simple #yourself
strategically sprinkled here or there is not such a huge deal for me,
except that a lot of them could be trivially eliminated by nothing
more than a flexibility from my peers about their interpretations of
"equivalence".

Post by Eliot Miranda
An alternative is Marcus Denker’s Opal compiler for Pharo which compiles #class and #== to normal sends, avoiding the inlined special selector bytecodes for these sends. That’s arguably an even better solution, because it retains the ability to inline. Although the mirror methods objectClass: and object:eqeq: can be used, with more overhead.
In any case, instead of proposing to change a natural and comprehensible idiom, we can instead evolve the implementation to meet our needs.

Smalltalk is supposed to be about messages, not types. So why
shouldn't we let the _messages_ determine equivalence instead of
enforcing conformance to some specific "type"? You know we already
have several classes that use #isKindOf: rather than == otherObject
class. So, I think my proposal is natural and comprehensible too,
especially. from the aspect that it avoids the real-world bugs that
can sometimes seem incomprehensible.

Post by Chris Muller
This is no surprise, since writing a send to #== instead of #= for no
more than "performance" is actually breaking encapsulation in the
firs t place...
There is an easy solution. When writing #= and #hash implementations,
use (#species or #xxxClass or #isKindOf: instead of #class) and #=
instead #==. So, for example, Eliot, I want to upgrade your new
= anObject
^self xxxClass = anObject xxxClass
and: [selector = anObject selector "selector lookup is by identity"
and: [lookupClass = anObject lookupClass
and: [args literalEqual: anObject arguments]]]
Or #species or #isKindOf:, like we do in many other methods. Now the
method is Proxy-compatible.
What do you think?

I think it is wrong. xxxClass is a horrible hack that should be banished ASAP.

Other than it having an unattractive name, what is wrong about it?
Smalltalk is supposed to be a full computer implemented in itself, but
you want to introduce a new VM feature and a new global setting that
slows down the entire image and even has to be reckoned with all the
way out in the Linux scripts (for cmdline arg), all just to avoid
writing #= instead o

Levente Uzonyi

2018-11-24 14:26:01 UTC

Post by Chris Muller
Hi Eliot,

Perhaps if the method were called #ensureNonProxiedReceiver instead of
#yourself, your peers would know why it is there.

Post by Eliot Miranda
An alternative is Marcus Denkerâs Opal compiler for Pharo which compiles #class and #== to normal sends, avoiding the inlined special selector bytecodes for these sends. Thatâs arguably an even better solution, because it retains the ability to inline. Although the mirror methods objectClass: and object:eqeq: can be used, with more overhead.
In any case, instead of proposing to change a natural and comprehensible idiom, we can instead evolve the implementation to meet our needs.

"foo isKindOf: Bar" is not a substitute for "foo class == Bar". "foo
isMemberOf: Bar" is what you should use instead (if you really want worse
performance :)).

Levente

Post by Chris Muller
This is no surprise, since writing a send to #== instead of #= for no
more than "performance" is actually breaking encapsulation in the
firs t place...
There is an easy solution. When writing #= and #hash implementations,
use (#species or #xxxClass or #isKindOf: instead of #class) and #=
instead #==. So, for example, Eliot, I want to upgrade your new
= anObject
^self xxxClass = anObject xxxClass
and: [selector = anObject selector "selector lookup is by identity"
and: [lookupClass = anObject lookupClass
and: [args literalEqual: anObject arguments]]]
Or #species or #isKindOf:, like we do in many other methods. Now the
method is Proxy-compatible.
What do you think?

I think it is wrong. xxxClass is a horrible hack that should be banished ASAP.

Chris Muller

2018-11-24 20:06:59 UTC

Post by Levente Uzonyi

Perhaps if the method were called #ensureNonProxiedReceiver instead of
#yourself, your peers would know why it is there.

Actually it is. Pretty close anyway, it's called
#realObjectIfMutatingProxy, but #yourself works too and I was trying
to keep the conversation simpler by sticking with standard familiar
vocabulary.

Post by Levente Uzonyi

"foo isKindOf: Bar" is not a substitute for "foo class == Bar".

We're talking about defining a useful meaning for equivalence between
two complex objects, not making a substitute for identity checks. I
doubt you can articulate as clear a position against #isKindOf: in
that context.

Post by Levente Uzonyi
"foo
isMemberOf: Bar" is what you should use instead (if you really want worse
performance :)).

I'm not sure if you're joking here, or not. :/ I'm open to all
serious sugge

Marcus Denker

2018-11-28 15:45:45 UTC

Post by Levente Uzonyi
Hi Chris,

Another idea might be to have a “slow bit” at a per object level: if that is set, some reifications would be done that normally are not done.
I read about this being done in JavaScript, at least at some point they made proxies practical with that approach.

Post by Levente Uzonyi
An alternative is Marcus Denker’s Opal compiler for Pharo which compiles #class and #== to normal sends, avoiding the inlined special selector bytecodes for these sends. That’s arguably an even better solution, because it retains the ability to inline. Although the mirror methods objectClass: and object:eqeq: can be used, with more overhead.

Do we really still need the speed we get from the special byte code? I am not sure if I remember correctly, but I think for #class this was already 2004 that Andreas discussed that in Bern when he visited, we even benchmarks and decided that it would be ok to not have a special #class bytecode without too much bad performance impact.

And it is late-binding something that is not late bound now. I always liked that as a “guid

Denis Kudriashov

2018-11-23 21:43:33 UTC