JS有名函数表达式全面解析_javascript技巧
时间:2021-07-01 10:21:17
帮助过:6人阅读
Example #1: Function expression identifier leaks into an enclosing scope
实例1:函数表达式标示符渗进了外围作用域
var f = function g(){};
typeof g; // "function"
Remember how I mentioned that an identifier of named function expression is not available in an enclosing scope? Well, JScript doesn't agree with specs on this one - g in the above example resolves to a function object. This is a most widely observed discrepancy. It's dangerous in that it inadvertedly pollutes an enclosing scope - a scope that might as well be a global one - with an extra identifier. Such pollution can, of course, be a source of hard-to-track bugs.
我刚才提到过一个有名函数表达式的标示符不能在外部作用域中被访问。但是,JScript在这点上和标准并不相符,在上面的饿例子中g却是一个函数 对象。这个是一个可以广泛观察到的差异。这样它就用一个多余的标示符污染了外围作用域,这个作用域很有可能是全局作用域,这样是很危险的。当然这个污染可 能是一个很难去处理和跟踪的bug的根源
Example #2: Named function expression is treated as BOTH - function declaration AND function expression
实例2:有名函数表达式被进行了双重处理,函数表达式和函数声明
typeof g; // "function"
var f = function g(){};
As I explained before, function declarations are parsed foremost any other expressions in a particular execution context. The above example demonstrates how JScript actually treats named function expressions as function declarations. You can see that it parses g before an “actual declaration” takes place.
正如我前面解释的,在一个特定的执行环境中,函数声明是在所有的表达式之前被解释。上面的例子说明JScript实际上把有名函数表达式作为一个函 数声明来对待。我们可以看到他在一个实际的声明之前就被解释了。
This brings us to a next example:
在此基础上我们引入了下面的一个例子。
Example #3: Named function expression creates TWO DISCTINCT function objects!
实例3:有名函数表达式创建两个不同的函数对象。
var f = function g(){};
f === g; // false
f.expando = 'foo';
g.expando; // undefined
This is where things are getting interesting. Or rather - completely nuts. Here we are seeing the dangers of having to deal with two distinct objects - augmenting one of them obviously does not modify the other one; This could be quite troublesome if you decided to employ, say, caching mechanism and store something in a property of f, then tried accessing it as a property of g, thinking that it is the same object you're working with.
在这里事情变得更加有趣了,或者是完全疯掉。这里我们看到必须处理两个不同的对象的危险,当扩充他们当中的一个的时候,另外一个不会相应的改变。如 果你打算使用cache机制并且在f的属性中存放一些东西,只有有试图在g的属性中访问,你本以为他们指向同一个对象,这样就会变得非常麻烦
Let's look at something a bit more complex.
让我们来看一些更复杂的例子。
Example #4: Function declarations are parsed sequentially and are not affected by conditional blocks
实例4:函数声明被顺序的解释,不受条件块的影响
var f = function g() {
return 1;
};
if (false) {
f = function g(){
return 2;
}
};
g(); // 2
An example like this could cause even harder to track bugs. What happens here is actually quite simple. First, g is being parsed as a function declaration, and since declarations in JScript are independent of conditional blocks, g is being declared as a function from the “dead” if branch - function g(){ return 2 }. Then all of the “regular” expressions are being evaluated and f is being assigned another, newly created function object to. “dead” if branch is never entered when evaluating expressions, so f keeps referencing first function - function g(){ return 1 }. It should be clear by now, that if you're not careful enough, and call g from within f, you'll end up calling a completely unrelated g function object.
像这样的一个例子可能会使跟踪bug非常困难。这里发生的问题却非常简单。首先g被解释为一个函数声明,并且既然JScript中的声明是和条件块 无关的,g就作为来自于已经无效的if分支中的函数被声明function g(){ return 2 }。之后普通的表达式被求值并且f被赋值为另外一个新创建的函数对象。当执行表达式的时候,由于if条件分支是不会被进入的,因此f保持为第一函数的引用 function g(){ return 1 }。现在清楚了如果不是很小心,而且在f内部调用g,你最终将调用一个完全无关的g函数对象。
You might be wondering how all this mess with different function objects compares to arguments.callee. Does callee reference f or g? Let's take a look:
你可能在想不从的函数对象和arguments.callee相比较的结果会是怎样呢?callee是引用f还是g?让我们来看一下
var f = function g(){
return [
arguments.callee == f,
arguments.callee == g
];
};
f(); // [true, false]
As you can see, arguments.callee references same object as f identifier. This is actually good news, as you will see later on.
我们可以看到arguments.callee引用的是和f标示符一样的对象,就像稍后你会看到的,这是个好消息
Looking at JScript deficiencies, it becomes pretty clear what exactly we need to avoid. First, we need to be aware of a leaking identifier (so that it doesn't pollute enclosing scope). Second, we should never reference identifier used as a function name; A troublesome identifier is g from the previous examples. Notice how many ambiguities could have been avoided if we were to forget about g's existance. Always referencing function via f or arguments.callee is the key here. If you use named expression, think of that name as something that's only being used for debugging purposes. And finally, a bonus point is to always clean up an extraneous function created erroneously during NFE declaration.
既然看到了JScript的缺点,我们应该避免些什么就非常清楚了。首先,我们要意识到标示符的渗出(以使得他不会污染外围作用域)。第二点,我们 不应该引用作为函数名的标示符;从前面的例子可以看出g是一个问题多多的标示符。请注意,如果我们忘记g的存在,很多歧义就可以被避免。通常最关键的就是 通过f或者argument.callee来引用函数。如果你使用有名的表达式,记住名字只是为了调试的目的而存在。最后,额外的一点就是要经常清理有名 函数表达式声明错误创建的附加函数
I think last point needs a bit of an explanation:
我想最有一点需要一些更多解释
JScript 内存管理
Being familiar with JScript discrepancies, we can now see a potential problem with memory consumption when using these buggy constructs. Let's look at a simple example:
熟悉了JScript和规范的差别,我们可以看到当使用这些有问题的结构的时候,和内存消耗相关的潜在问题
var f = (function(){
if (true) {
return function g(){};
}
return function g(){};
})();
We know that a function returned from within this anonymous invocation - the one that has g identifier - is being assigned to outer f. We also know that named function expressions produce superfluous function object, and that this object is not the same as returned function. The memory issue here is caused by this extraneous g function being literally “trapped” in a closure of returning function. This happens because inner function is declared in the same scope as that pesky g one. Unless we explicitly break reference to g function it will keep consuming memory.
我们发现从匿名调用中返回的一个函数,也就是以g作为标示符的函数,被复制给外部的f。我们还知道有名函数表达式创建了一个多余的函数对象,并且这 个对象和返回的对象并不是同一个函数。这里的内存问题就是由这个没用的g函数在一个返回函数的闭包中被按照字面上的意思捕获了。这是因为内部函数是和可恶 的g函数在同一个作用域内声明的。除非我们显式的破坏到g函数的引用,否则他将一直占用内存。
var f = (function(){
var f, g;
if (true) {
f = function g(){};
}
else {
f = function g(){};
}
//给g赋值null以使他不再被无关的函数引用。
//null `g`, so that it doesn't reference extraneous function any longer
g = null;
return f;
})();
Note that we explicitly declare g as well, so that g = null assignment wouldn't create a global g variable in conforming clients (i.e. non-JScript ones). By nulling reference to g, we allow garbage collector to wipe off this implicitly created function object that g refers to.
注意,我们又显式的声明了g,所以g=null赋值将不会给符合规范的客户端(例如非JScirpt引擎)创建一个全局变量。通过给g以null的 引用,我们允许垃圾回收来清洗这个被g所引用的,隐式创建的函数对象。
When taking care of JScript NFE memory leak, I decided to run a simple series of tests to confirm that nulling g actually does free memory.
当考虑 JScript的有名函数表达式的内存泄露问题时,我决定运行一系列简单的测试来证实给g函数null的引用实际上可以释放内存
测试
The test was simple. It would simply create 10000 functions via named function expressions and store them in an array. I would then wait for about a minute and check how high the memory consumption is. After that I would null-out the reference and repeat the procedure again. Here's a test case I used:
这个测试非常简单。他将通过有名函数表达式创建1000个函数,并将它们储存在一个数组中。我等待了大约一分钟,并查看内存使用有多高。只有我们加 上null引用,重复上述过程。下面就是我使用的一个简单的测试用例
function createFn(){
return (function(){
var f;
if (true) {
f = function F(){
return 'standard';
}
}
else if (false) {
f = function F(){
return 'alternative';
}
}
else {
f = function F(){
return 'fallback';
}
}
// var F = null;
return f;
})();
}
var arr = [ ];
for (var i=0; i<10000; i++) {
arr[i] = createFn();
}
Results as seen in Process Explorer on Windows XP SP2 were:
结果是在Windows XP SP2进行的,通过进程管理器得到的
IE6:
without `null`: 7.6K -> 20.3K
with `null`: 7.6K -> 18K
IE7:
without `null`: 14K -> 29.7K
with `null`: 14K -> 27K
The results somewhat confirmed my assumptions - explicitly nulling superfluous reference did free memory, but the difference in consumption was relatively insignificant. For 10000 function objects, there would be a ~3MB difference. This is definitely something that should be kept in mind when designing large-scale applications, applications that will run for either long time or on devices with limited memory (such as mobile devices). For any small script, the difference probably doesn't matter.
结果在一定程度上证实了我的假设,显示的给无用的参考以null值确实会释放内存,但是在内寸的消耗的区别上貌似不是很大。对于1000个函数对 象,大约应该有3M左右的差别。但是有一些是明确的,在设计大规模的应用的时候,应用要不就是要运行很长时间的或者要在一个内存有限的设备上(例如移动设 备)。对于任何小的脚本,差别可能不是很重要。
You might think that it's all finally over, but we are not just quite there yet :) There's a tiny little detail that I'd like to mention and that detail is Safari 2.x
你可以认为这样就可以结束了,但是还没到结束的时候。我还要讨论一些小的细节,而且这些细节是在Safari 2.x下的
Safari bug
Even less widely known bug with NFE is present in older versions of Safari; namely, Safari 2.x series. I've seen some claims on the web that Safari 2.x does not support NFE at all. This is not true. Safari does support it, but has bugs in its implementation which you will see shortly.
虽然没有被人们发现在早期的Safari版本,也就是Safari 2.x版本中有名函数表达式的bug。但是我在web上看到一些声称Safari 2.x根本不支持有名函数表达式。这不是真的。Safari的确支持有名函数表达式,但是稍后你将看到在它的实现中是存在bug的
When encountering function expression in a certain context, Safari 2.x fails to parse the program entirely. It doesn't throw any errors (such as SyntaxError ones). It simply bails out:
在某些执行环境中遇到函数表达式的时候,Safari 2.x 将解释程序整体失败。它不抛出任何的错误(例如SyntaxError)。展示如下
(function f(){})(); // <== 有名函数表达式 NFE
alert(1); //因为前面的表达式是的整个程序失败,本行将无法达到, this line is never reached, since previous expression fails the entire program
After fiddling with various test cases, I came to conclusion that Safari 2.x fails to parse named function expressions, if those are not part of assignment expressions. Some examples of assignment expressions are:
在用一些测试用例测试之后,我总结出,如果有名函数表达式不是赋值表达式的一部分,Safari解释有名函数表达式将失败。一些赋值表达式的例子如 下
// 变量声明part of variable declaration
var f = 1;
//简单的赋值 part of simple assignment
f = 2, g = 3;
// 返回语句part of return statement
(function(){
return (f = 2);
})();
This means that putting named function expression into an assignment makes Safari “happy”:
这就意味着把有名函数表达式放到赋值表达式中会让 Safari非常“开心”
(function f(){}); // fails 失败
var f = function f(){}; // works 成功
(function(){
return function f(){}; // fails 失败
})();
(function(){
return (f = function f(){}); // works 成功
})();
setTimeout(function f(){ }, 100); // fails
It also means that we can't use such common pattern as returning named function expression without an assignment:
这也意味着我们不能使用这种普通的模式而没有赋值表达式作为返回有名函数表达式
//要取代这种Safari2.x不兼容的情况 Instead of this non-Safari-2x-compatible syntax:
(function(){
if (featureTest) {
return function f(){};
}
return function f(){};
})();
// 我们应该使用这种稍微冗长的替代方法we should use this slightly more verbose alternative:
(function(){
var f;
if (featureTest) {
f = function f(){};
}
else {
f = function f(){};
}
return f;
})();
// 或者另外一种变形or another variation of it:
(function(){
var f;
if (featureTest) {
return (f = function f(){});
}
return (f = function f(){});
})();
/*
Unfortunately, by doing so, we introduce an extra reference to a function
which gets trapped in a closure of returning function. To prevent extra memory usage,
we can assign all named function expressions to one single variable.
不幸的是 这样做我们引入了对函数的另外一个引用
他将被包含在返回函数的闭包中
为了防止多于的内存使用,我们可以吧所有的有名函数表达式赋值给一个单独的变量
*/
var __temp;
(function(){
if (featureTest) {
return (__temp = function f(){});
}
return (__temp = function f(){});
})();
...
(function(){
if (featureTest2) {
return (__temp = function g(){});
}
return (__temp = function g(){});
})();
/*
Note that subsequent assignments destroy previous references,
preventing any excessive memory usage.
注释:后面的赋值销毁了前面的引用,防止任何过多的内存使用
*/
If Safari 2.x compatibility is important, we need to make sure “incompatible” constructs do not even appear in the source. This is of course quite irritating, but is definitely possible to achieve, especially when knowing the root of the problem.
如果Safari2.x的兼容性非常重要。我们需要保证不兼容的结构不再代码中出现。这当然是非常气人的,但是他确实明确的可以做到的,尤其是当我 们知道问题的根源。
It's also worth mentioning that declaring a function as NFE in Safari 2.x exhibits another minor glitch, where function representation does not contain function identifier:
还值得一提的是在Safari中声明一个函数是有名函数表达式的时候存在另外一个小的问题,这是函数表示法不含有函数标示符(估计是 toString的问题)
var f = function g(){};
// Notice how function representation is lacking `g` identifier
String(g); // function () { }
This is not really a big deal. As I have already mentioned before, function decompilation is something that should not be relied upon anyway.
这不是个很大的问题。因为之前我已经说过,函数反编译在任何情况下都是不可信赖的。
解决方案
var fn = (function(){
//声明一个变量,来给他赋值函数对象 declare a variable to assign function object to
var f;
// 条件的创建一个有名函数 conditionally create a named function
// 并把它的引用赋值给f and assign its reference to `f`
if (true) {
f = function F(){ }
}
else if (false) {
f = function F(){ }
}
else {
f = function F(){ }
}
//给一个和函数名相关的变量以null值 Assign `null` to a variable corresponding to a function name
//这可以使得函数对象(通过标示符的引用)可以被垃圾收集所得到This marks the function object (referred to by that identifier)
// available for garbage collection
var F = null;
//返回一个条件定义的函数 return a conditionally defined function
return f;
})();
Finally, here's how we would apply this “techinque” in real life, when writing something like a cross-browser addEvent function:
最后,当我么一个类似于跨浏览器addEvent函数的类似函数时,下面就是我们如何在真实的应用中使用这个技术
// 1) 用一个分离的作用域封装声明 enclose declaration with a separate scope
var addEvent = (function(){
var docEl = document.documentElement;
// 2)声明一个变量,用来赋值为函数 declare a variable to assign function to
var fn;
if (docEl.addEventListener) {
// 3) 确保给函数一个描述的标示符 make sure to give function a descriptive identifier
fn = function addEvent(element, eventName, callback) {
element.addEventListener(eventName, callback, false);
}
}
else if (docEl.attachEvent) {
fn = function addEvent(element, eventName, callback) {
element.attachEvent('on' + eventName, callback);
}
}
else {
fn = function addEvent(element, eventName, callback) {
element['on' + eventName] = callback;
}
}
// 4)清除通过JScript创建的addEvent函数 clean up `addEvent` function created by JScript
// 保证在赋值之前加上varmake sure to either prepend assignment with `var`,
// 或者在函数顶端声明 addEvent or declare `addEvent` at the top of the function
var addEvent = null;
// 5)最后通过fn返回函数的引用 finally return function referenced by `fn`
return fn;
})();
可替代的解决方案
It's worth mentioning that there actually exist alternative ways of
having descriptive names in call stacks. Ways that don't require one to
use named function expressions. First of all, it is often possible to
define function via declaration, rather than via expression. This option
is only viable when you don't need to create more than one function:
需要说明,实际上纯在一个种使得在调用栈上显示描述名称(函数名)的替代方法。一个不需要使用有名函数表达式的方法。首先,通常可以使用声明而不是
使用表达式来定义函数。这种选择通常只是适应于你不需要创建多个函数的情况。
var hasClassName = (function(){
// 定义一些私有变量define some private variables
var cache = { };
//使用函数定义 use function declaration
function hasClassName(element, className) {
var _className = '(?:^|\\s+)' + className + '(?:\\s+|$)';
var re = cache[_className] || (cache[_className] = new RegExp(_className));
return re.test(element.className);
}
// 返回函数return function
return hasClassName;
})();
This obviously wouldn't work when forking function definitions.
Nevertheless, there's an interesting pattern that I first seen used by
Tobie Langel. The way it works is by defining all functions
upfront using function declarations, but giving them slightly different
identifiers:
这种方法显然对于多路的函数定义不适用。但是,有一个有趣的方法,这个方法我第一次在看到Tobie
Langel.在使用。这个用函数声明定义所有的函数,但是给这个函数声明以稍微不同的标示符。
var addEvent = (function(){
var docEl = document.documentElement;
function addEventListener(){
/* ... */
}
function attachEvent(){
/* ... */
}
function addEventAsProperty(){
/* ... */
}
if (typeof docEl.addEventListener != 'undefined') {
return addEventListener;
}
elseif (typeof docEl.attachEvent != 'undefined') {
return attachEvent;
}
return addEventAsProperty;
})();
While it's an elegant approach, it has its own drawbacks. First, by
using different identifiers, you loose naming consistency. Whether it's
good or bad thing is not very clear. Some might prefer to have identical
names, while others wouldn't mind varying ones; after all, different
names can often “speak” about implementation used. For example, seeing
“attachEvent” in debugger, would let you know that it is an attachEvent-based implementation of addEvent. On the other hand,
implementation-related name might not be meaningful at all. If you're
providing an API and name “inner” functions in such way, the user of API
could easily get lost in all of these implementation details.
虽然这是一个比较优雅的方法,但是他也有自己的缺陷。首先,通过使用不同的标示符,你失去的命名的一致性。这是件好的事情还是件坏的事情还不好说。
有些人希望使用一支的命名,有些人则不会介意改变名字;毕竟,不同的名字通常代表不同的实现。例如,在调试器中看到“attachEvent”,你就可以
知道是addEvent基于attentEvent的一个实现。另外一方面,和实现相关的名字可能根本没有什意义。如果你提供一个api并用如此方法命名
内部的函数,api的使用者可能会被这些实现细节搞糊涂。
A solution to this problem might be to employ different naming
convention. Just be careful not to introduce extra verbosity. Some
alternatives that come to mind are:
解决这个问题的一个方法是使用不同的命名规则。但是注意不要饮用过多的冗余。下面列出了一些替代的命名方法
`addEvent`, `altAddEvent` and `fallbackAddEvent`
// or
`addEvent`, `addEvent2`, `addEvent3`
// or
`addEvent_addEventListener`, `addEvent_attachEvent`, `addEvent_asProperty`
Another minor issue with this pattern is increased memory
consumption. By defining all of the function variations upfront, you
implicitly create N-1 unused functions. As you can see, if attachEvent is found in document.documentElement,
then neither addEventListener nor addEventAsProperty are ever really used. Yet, they
already consume memory; memory which is never deallocated for the same
reason as with JScript's buggy named expressions - both functions are
“trapped” in a closure of returning one.
这种模式的另外一个问题就是增加了内存的开销。通过定义所有上面的函数变种,你隐含的创建了N-1个函数。你可以发现,如果attachEvent
在document.documentElement中发现,那么addEventListener和addEventAsProperty都没有被实际
用到。但是他们已经消耗的内存;和Jscript有名表达式bug的原因一样的内存没有被释放,在返回一个函数的同时,两个函数被‘trapped‘在闭
包中。
This increased consumption is of course hardly an issue. If a library
such as Prototype.js was to use this pattern, there would be not more
than 100-200 extra function objects created. As long as functions are
not created in such way repeatedly (at runtime) but only once (at load
time), you probably shouldn't worry about it.
这个递增的内存使用显然是个严重的问题。如果和Prototype.js类似的库需要使用这种模式,将有另外的100-200个多于的函数对象被创
建。如果函数没有被重复地(运行时)用这种方式创建,只是在加载时被创建一次,你可能就不用担心这个问题。