2007-08-03 18:21 | fche blog tech systemtap safety
Sun’s Bryan Cantrill still pulls out this old chestnut about why dtrace is supposed to be the one true way of instrumenting a system.
It makes some good points, but goes off on a wild strawman chase at this point:
For us, the answer was so clear that it was almost unspoken: we knew that we needed to develop a virtual machine that could act as a target instruction set for a custom compiler. Why was this the clear choice? Because the alternative — to execute user-specified code natively in the kernel — is untenable from a safety perspective.
This false dichotomy overlooks a third possibility – the one adopted by systemtap. Namely, we don’t run user-specified code natively in the kernel: we run code that is synthesized by our tool. This synthesized code includes much the same control/data checks as the dtrace virtual machine has to have. Actually, it has more, since our input scripting language is significantly more expressive.
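To make the idea of synthesized code concrete, here is a toy sketch in Python. It is not systemtap's translator, which emits C for a kernel module; every name in it (`synthesize`, `run`, `_checked_div`, `ProbeError`) is hypothetical and exists only for illustration. The user's expression is never executed as written: the tool parses it, rejects constructs outside a small whitelist, and rewrites each division into a guarded call before compiling.

```python
import ast


class ProbeError(Exception):
    """Raised by the tool or by synthesized code instead of faulting."""


def _checked_div(a, b):
    # Runtime data check that the translator inserts around every division.
    if b == 0:
        raise ProbeError("division by zero")
    return a // b


# Hypothetical whitelist: integer arithmetic over named variables only.
ALLOWED_NODES = (ast.Expression, ast.BinOp, ast.Constant, ast.Name, ast.Load,
                 ast.Add, ast.Sub, ast.Mult, ast.Div)


class Synthesizer(ast.NodeTransformer):
    """Rewrites `a / b` into `_checked_div(a, b)`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested divisions first
        if isinstance(node.op, ast.Div):
            call = ast.Call(
                func=ast.Name(id="_checked_div", ctx=ast.Load()),
                args=[node.left, node.right],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def synthesize(script, variables):
    """Validate the user's expression, then emit checked code for it."""
    tree = ast.parse(script, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ProbeError("construct not allowed: " + type(node).__name__)
        if isinstance(node, ast.Name) and node.id not in variables:
            raise ProbeError("unknown variable: " + node.id)
        if isinstance(node, ast.Constant) and not isinstance(node.value, int):
            raise ProbeError("only integer constants are allowed")
    tree = ast.fix_missing_locations(Synthesizer().visit(tree))
    return compile(tree, "<synthesized>", "eval")


def run(code, values):
    # The synthesized code sees only its inputs and the inserted helper.
    env = {"__builtins__": {}, "_checked_div": _checked_div}
    env.update(values)
    return eval(code, env)


if __name__ == "__main__":
    code = synthesize("total / count + 1", {"total", "count"})
    print(run(code, {"total": 10, "count": 3}))  # prints 4
```

The point of the sketch is only the division of labor: validation happens once at translation time, and the remaining checks travel with the generated code.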
Accepting these facts, one would have to dig deeper for essential differences – and deeper yet to exclude those that represent bugs and incomplete work rather than architectural limitations. But such a comparison takes more work.
It’s not at all a false dichotomy. What, pray tell, is the privilege model in your world? With DTrace, administrators can safely dole out fine-grained privileges to otherwise untrusted users, knowing that they can do no harm to the system. With SystemTap, you will always require the entity generating the object code to be running as superuser, lest you open yourself up to attacks of all flavors.
And while the inability to accommodate fine-grained privileges remains the most obvious shortcoming of the SystemTap model, it is by no means the only one: you have made the process of validating your safety infinitely more difficult (if not impossible) by placing the compiler in the safety-critical path. For example: you are presumably (hopefully?) using compiler options to prevent the generation of floating point operations. But what if those options should change or shift? Or what if there is some code path that will generate floating point operations despite them? Assuming Linux is like most other operating systems and doesn’t allow floating point in the kernel, this failure will result in user-level data corruption as you plow someone’s floating point registers. And best of luck debugging that one.
Bryan Cantrill (Email) (URL) - 2007-08-03 18:59
Frank,
Over on Eugene’s blog you mentioned that there were technical reasons why DTrace couldn’t be ported to Linux: http://eugeneteo.livejournal.com/8911.ht..
What are those reasons?
Also, are you suggesting that the comparison you linked to presents an objective and fair technical comparison? Note that ‘no’ doesn’t appear in the left column, while ‘not yet’, ‘soon’, and ‘for now’ don’t appear in the right column.
Adam Leventhal (URL) - 2007-08-03 19:46
Adam, regarding the chart, the difference is that we perceive dtrace as “finished” and systemtap as “under development”. Regarding technical reasons, code is indeed just code, but never underestimate the implications of a highly fluid community-managed versus a centrally controlled, slowly evolving kernel.
Frank - 2007-08-03 19:56
Bryan, yes, the compiler/toolchain is placed within the trusted base. This has obvious implications for unprivileged users running scripts of their own. As you know, we have not finished this part, but we have some ideas. Keep listening in on the project mailing list.
As for floating point and such stuff, it may comfort the reader to know that systemtap modules are built exactly the same way as all the other modules that form the kernel – down to invoking the same makefiles. We won’t invoke floating point, and use no compiler that does so gratuitously. Was this sort of thing ever a problem during dtrace development?
Frank - 2007-08-03 20:11
Frank, both the perception that DTrace is finished and the perception that OpenSolaris is slow-moving are just false. In the past few years Solaris has been the source of dozens of innovative technologies. You may also recall where the inspiration for DTrace came from…
You may also want to take a look at the changes that have happened in DTrace since you started working on SystemTap — it’s hardly a stationary target.
Adam Leventhal (URL) - 2007-08-03 20:17
Frank, Of course it wasn’t a problem during DTrace development because, um, we don’t execute native code. But I love your “we have not finished this part” excuse — and more generally your “not yet” and “soon” attitude towards aspects of the problem that you are not only not thinking about, but have actively made worse by poor design decisions. “Not yet” and “soon” work well when a project is young, but SystemTap is no longer a spring chicken: you’ve now been around longer than DTrace was before we shipped (and despite the claims you’ve made from time to time, we had fewer people working on DTrace than you have had on SystemTap). But keep up with the “soon”, even when you know damn well that it’s not coming “soon” — it fits very nicely with the pattern of intellectual dishonesty that has become the hallmark of the SystemTap project.
Bryan Cantrill (Email) (URL) - 2007-08-03 20:28
Bryan: there is no need for such tone and puerile accusations. Goodbye.
Adam: have there been any dtrace changes that would invalidate that comparison table?
Frank - 2007-08-03 20:51
Frank, to be honest, I don’t think that comparison table was ever particularly accurate (a fact which James Dickens tried to point out to you a year ago). If you give me an account on the wiki, I’d be happy to amend it. That said, was it really intended to be a fair and unbiased comparison? It’s amusing that ‘probing JVM’ (which took months to implement) would carry as much weight as ‘division-by-zero protection’ (about 2 minutes).
I wouldn’t be so quick to dismiss my colleague’s comments. If you feel that your work has been slighted, then bring evidence to counter those claims, but I’m sure you can understand — for example — how one could be of the opinion that SystemTap was unsafe to use in production: there are literally dozens of posts that would give one that impression (not to mention the comparison chart).
On the other hand: assertions about DTrace’s capabilities which are essentially indefensible and unabashed plagiarism are a bit more difficult to understand.
Adam Leventhal (URL) - 2007-08-03 22:40
... and in case you missed it, I’m still curious about the technical hurdles to porting DTrace to Linux to which you alluded: http://eugeneteo.livejournal.com/8911.ht..
Adam Leventhal (URL) - 2007-08-03 23:12
The wiki is public (you can make yourself an account), and you’re welcome to correct
any genuine errors of fact, but editorializing belongs at your own venue. Mentioning
division-by-zero is kind of funny actually, since I never thought it was that noteworthy,
but various dtrace fans made a big deal of it, so I put it in for jest.
Regarding plagiarism and whatnot – we could waste time and analyze the depth of each
alleged transgression. Yes, maybe systemtap is both too much and too little like
dtrace … or something. The bottom line is that implementation-wise, the projects
have approximately nothing in common, and for a variety of reasons, this is likely
to continue as both projects mature.
Frank - 2007-08-03 23:49
I made an account, but I still can’t seem to edit it. Once that problem is corrected, would it be inappropriate to remove existing editorializing?
If you’re not planning on addressing the other points above could you let me know so I can stop reminding you? Thanks.
Adam Leventhal (URL) - 2007-08-04 00:19
> [...] would it be inappropriate to remove existing editorializing?
It should not be necessary to spell out appropriate etiquette for
a guest from a competing (?) project being invited to check for
factual errors on our wiki.
> If you’re not planning on addressing the other points above [...]
You mean listing the likely obstacles for a dtrace linux port?
I’m sorry, that’s a homework question.
Frank - 2007-08-04 00:37
Thanks for giving me the opportunity to address some of the problematic spots in that comparison. I took a crack at it; let me know if I strayed from the appropriate etiquette. Would it be appropriate to add entries for languages in addition to Java? I thought that might be too aggressive, but it would probably be interesting to know SystemTap’s plans for dynamic language instrumentation beyond just Java.
I couldn’t understand what was meant by a few of the items: end-user extendable probe library, kernel coupling (“upstream” lock-step), binary tracing, context pointer type punning.
Adam Leventhal (URL) - 2007-08-04 01:13
Adam, thank you for your reasonableness in clarifying the wiki page.
I’m sure there will remain some contentious items, mostly on account
of contrasting areas of technical emphasis, but that’s the way it
goes.
I hope that this sort of civilized interchange will continue and
result in cooling down the wholly unnecessary enmity.
Frank - 2007-08-04 09:44
We’re all for cooling down unnecessary enmity — but outrage at blatant plagiarism by one of your team hardly strikes me as “unnecessary.” If you would like “civilized interchange”, start by doing the Right Thing: have Eugene Teo apologize on his blog for having plagiarized the DTrace wiki, and acknowledge yourself that the attempt to scrub DTrace from the history of SystemTap was ill advised. If you are willing to do this, we will acknowledge it and link to it on our blogs, and we can begin to collectively repair the violation of trust…
Bryan Cantrill (Email) (URL) - 2007-08-04 11:54
Bryan: “blatant plagiarism” of a single similar page of a presentation
is regrettable, but calling it a “plagiarism of the wiki” is surely an
exaggeration. Regarding scrubbing history, if you read the changelog
carefully, it was a line removed from the man page only. If we actually
cared to excise the fact of inspiration (whatever that’s worth), we’d actually
carry it out without leaving permanent public traces of it. As for the
duplicate udpstat name and trace format – dtt should have been credited as a
model. I’ll fix that. Regarding “not recommending dtrace”, you’ve just
managed to misread a non-native English speaker’s talk based upon a
headline.
I can’t speak for the whole team, but it’s clear some of these
were mistakes. However, blowing them way out of proportion, to
impugn the “intellectual honesty” of the group, and to pretend that
any of this is somehow relevant to the technical issues is not any better.
Frank - 2007-08-04 12:20
“Regrettable”? We might have different standards for personal integrity if you treat transgressions of trust as merely “regrettable”. (At the very least, I would expect you to label it as “inappropriate”, if not “unacceptable” — and I personally have much stronger labels for acts of intellectual theft.) So if you believe that acts of plagiarism are inappropriate and unacceptable, have Eugene publicly apologize, we’ll acknowledge it, and we can move on.
Bryan Cantrill (Email) (URL) - 2007-08-04 12:54
Eugene will act for himself as he sees fit, but your use of silly terms like “theft” over
four or five lines of text is unlikely to merit a response of corresponding severity.
Frank - 2007-08-04 13:23
“Theft” is not a “silly” term — it’s a serious charge that I reserve for things like, well, acts of plagiarism. In terms of having Eugene try to set this right: do you lack the authority or the will to get him to act?
Bryan Cantrill (Email) (URL) - 2007-08-04 13:30
Bryan, I see no point in continuing this discussion, when you act so offended over someone
else’s four lines, and yet casually throw around libel like “pattern of dishonesty …
that is the hallmark of the systemtap project”. Enough.
Frank - 2007-08-04 13:40