Talk:Scale space

This article is within the scope of WikiProject Computer Vision, a collaborative effort to improve the coverage of Computer Vision on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Low-importance on the importance scale.

This article is within the scope of WikiProject Robotics, a collaborative effort to improve the coverage of Robotics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Low-importance on the project's importance scale.

untitled

I'm quite concerned about the rewrite of this article. It has changed the scope of "scale-space" quite dramatically, from the way it is used by the computer vision community for 2-D and 3-D images today towards an emphasis on coarse-to-fine segmentation approaches for 1-D signals, as proposed in the 1980s, and towards implementation aspects.

If implementation issues should be emphasized, I would suggest that this topic be addressed in another article on "scale-space implementation". Moreover, if the notion of "scale-space segmentation" and coarse-to-fine approaches should be addressed, these topics could also be developed in another subarticle. There have been quite a number of works in these areas, and I would say that the current article does not give a balanced view of the different approaches that have been investigated over the years.

The approach I took when rewriting this article was to give a brief overview of "scale-space" as it is used by the computer vision community today, and to list some of the main references in the field for a new reader to get familiar with the topic. The previous selection of "References" was organised in this way. I also listed some main references in related areas under "Related work".

Tpl 09:58, 8 June 2006 (UTC)

Tpl, my main problem with the article, which may be a problem with the current concept in the vision community, to some extent, is the over-reliance on the formal derivation of the uniqueness of the Gaussian based on too many constraining properties. You ossify that problem if you insist on the formal Gaussian in this article and relegate everything else to "implementation" issues. A number of the studies you cite have proven things about the characteristic function (Fourier transform) or generating function (Z transform) of the impulse response of an acceptable smoothing function based on various sensible criteria, and the conditions I cited for Z-plane poles and zeros come out of that, with as much formal validity as the unique Gaussian, for an appropriate set of constraints. That, I think, belongs in the main article, to help break the habit of thinking that only the Gaussian is ideally suited for scale space. Also, what do you mean by "causality" with respect to Gaussian smoothing? Isn't a Gaussian infinite in both directions and therefore not causal? As to the scale-space segmentation article, that's probably a good idea, since it's primarily 1D while the rest is primarily 2D. Dicklyon 16:45, 8 June 2006 (UTC)

ps: You don't have a user page, but I assume you are Tony P. Lindeberg, and therefore much more knowledgeable about this stuff than I am. I recently ordered a copy of your (expensive!) book so I can get caught up on the thinking in this community. I used to work with Witkin at SPAR, and too much of my thinking is probably too old as a result of not having followed all your good work since then (but some, anyway). Dicklyon 16:53, 8 June 2006 (UTC)

I think that your comments are useful. As you point out, there are several theories and approaches that are not mentioned in the current article. On the other hand, if one developed these notions, the presentation might become so technical that the overview is lost. To address your concern, I have added another article on multi-scale approaches, intended to describe some of the other multi-scale approaches that are prevalent in the field. So far, it only lists the special scale-space theory for one-dimensional signals, but there are other theories that could be included, such as the scale-invariant semi-groups studied by Pauwels.

Concerning your question about "causality", there is an unfortunate confusion about terminology in this respect. In Koenderink's derivation of the uniqueness of the Gaussian kernel in 2-D, he makes use of a property of level curves over scales which he refers to as "causality". This notion bears some relation to causality as it is used in signal processing, but has nothing to do with one-sided kernels that do not access the future.

Concerning properties of poles, please fill in these details in the article on multi-scale approaches if you feel that this description is useful to others. Please let me know your opinion of this compromise.

Tpl 12:45, 9 June 2006 (UTC)

Now that you've got it all organized into articles, why don't we just combine those as sections? Dicklyon 17:59, 10 June 2006 (UTC)

Well, there is quite a difference in style and level of detail. The overall scale-space article is at a comprehensive overview level, describing scale-space methods as they are used today in the computer vision community, while the scale-space implementation article is more technical, the scale-space axioms article is more mathematical, the scale-space segmentation article is more historical and the multi-scale approaches article is not complete in its present form. Within one article, I think that it is important to keep the level of presentation reasonably consistent. To merge these five articles into one article that is consistent in terms of mathematical and algorithmic detail as well as references would require a substantial amount of work and would imply writing a longer review article. Then, I'm afraid that the encyclopedia style would be lost.

Tpl 10:55, 11 June 2006 (UTC)

I don't see the problem. To me, your main article "comprehensive overview" is actually rather deep and narrow. I'd like to see more breadth in the main article, even if different sections are at different levels of sophistication; that's how good wikipedia articles usually evolve.

Anyone else have opinions here? Dicklyon 20:14, 11 June 2006 (UTC)

Regarding the related topic of wavelets, the presentation is split into several articles (continuous wavelet transform, discrete wavelet transform, multiresolution analysis) as well as a large number of pages for various types of specific wavelets. If we follow a similar approach for scale-space, there could be more articles than those that have been written so far. For several sentences in the current scale space article it would be possible to write an entire article that explains the notion in more detail, including mathematical definitions, algorithmic details and a richer set of references.

N-jets

This term N-jets is used without definition. Can someone please add an explanation of exactly what it is. Dicklyon 05:16, 26 August 2006 (UTC)

Now, there is a link from the first occurrence of the word N-jet to a new article on the topic. More could definitely be added, but it is at least a first outline. Tpl 14:07, 4 September 2006 (UTC)

Arguments against the suggested merge

Concerning the suggestion to merge the current scale-space article with the articles on scale-space axioms and scale-space implementation, I'm sorry to say that I'm strongly against this suggestion. The level of presentation is substantially different in each of these articles, while each individual article is now self-consistent, with a uniform style and level of presentation. The scale-space article is an overview article, while the articles on scale-space axioms and scale-space implementation are very detailed and technical. If one tried to merge these articles, the result would become extremely unbalanced. Please also note that the current scale-space article refers to other articles with more technical content within the area of scale-space, in particular the articles on blob detection, corner detection, ridge detection, edge detection, scale-space segmentation and affine shape adaptation. Tpl 08:42, 16 January 2007 (UTC)

I'm OK either way on this. It's not bad as it is. Tpl had split it up after I had added a lot of new content to the original one article, and I agree it works this way. Dicklyon 16:25, 16 January 2007 (UTC)

Shouldn't we have a snippet on scale-space axioms and scale-space implementation at least? I understand how these topics deserve an article of their own, but the scale space article should definitely have at least a "preview" of the other articles as the axioms (esp) are key to understanding the theory. Showing snippets of "subarticles" seems to be a standard wikipedia format.

Now, I have added five pointers or "snippets" of topics that are treated in more detail in other articles. Please let me know about your reactions to this. Tpl 10:32, 31 January 2007 (UTC)

Intuition

As someone who is quite familiar with image processing, including spectral frequency decomposition of images, etc., I would like to offer the idea that this article does an excellent job at excluding anything that might give a reader who is new to the topic any intuitive understanding of the topic that might serve as a step towards an eventual more complete formal understanding. I imagine that a scale-space representation of an image is constructed by using blurs of increasing radius to capture or isolate features at different levels of granularity... but I have no idea from reading the article.

Some ideas and comments on the lead section

This is a relevant and interesting article but it still lacks the structure to allow someone not already in the field to get an idea of what it is. The main problem is the lead section (from start to the content table) which is way too long. It must be possible to introduce the concept in a few paragraphs and maybe some images and leave the deeper explanations to the remaining sections. Apart from that there are also some issues which come to mind when I read the text:

Scale space theory is a framework (can we say mathematical framework to be more precise?) for multi-scale signal representation developed by the computer vision, image processing and signal processing communities. It is a formal theory for handling image structures (why only images?) at different scales in such a way that fine-scale features can be successively (the word successively suggests that there is a time variable involved. Although this is sometimes correct in practice, it is not a formal characteristic of a scale space to think about the scale levels as being produced one after the other, is it?) suppressed and a scale parameter $t$ (again t suggests time here) can be associated with each level in the scale-space representation. The reader has now been told that there are levels in a scale space, but only in an implicit way. Can this be made more explicit?

The notion of scale-space is general and applies in arbitrary dimensions. For simplicity of presentation, however, we here describe the most commonly used case with two-dimensional images. It is not obvious for most readers why the two-dimensional case is the simpler case. We could say that 1D is simpler but 2D is more common, or I would even suggest presenting the 1D case SINCE IT IS SIMPLER! For a given image $f(x,y)$, its linear (why do we need a linear here? Are there also non-linear scale spaces?) scale-space representation is a family of derived signals $L(x,y,t)$ defined by convolution of $f(x,y)$ with the Gaussian kernel

g(x,y,t)={\frac {1}{2{\pi }t}}e^{-(x^{2}+y^{2})/2t}.\,

such that

L(x,y,t)\ =g(x,y,t)*f(x,y).

where $t=\sigma ^{2}$ is the variance of the Gaussian. I believe that it is possible to talk about a non-Gaussian scale space, which will not satisfy all the scale space axioms. Are the axioms really defining the scale space concept or are they just providing specifications for a particularly useful class of scale spaces? Equivalently, the scale-space family can be generated from the solution of the heat equation,

\partial _{t}L={\frac {1}{2}}\nabla ^{2}L,

with initial condition $L(x,y,0)=f(x,y)$. This is interesting but does not have to be included in the lead section. Maybe a section Relation to the heat equation where this connection is developed? Several different derivations have been expressed showing that this is the canonical way to generate a linear scale-space, based on the essential requirement that new structures must not be created from a fine scale to any coarser scale. Conditions, referred to as scale-space axioms, that have been used for deriving the uniqueness of the Gaussian kernel include linearity, shift-invariance, semi-group structure, non-enhancement of local extrema, scale invariance and rotational invariance. Interesting, but can be moved from the lead to a separate section which explains the axioms.
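For readers following this thread, the definition quoted above (convolution of $f(x,y)$ with a Gaussian of variance $t$) can be sketched in a few lines. This is only an illustrative sketch, not the article's implementation: it assumes a sampled, truncated and renormalised Gaussian, replicated image borders, and the function names (gaussian_kernel_1d, smooth_rows, scale_space_level) are made up for this example.

```python
import math

def gaussian_kernel_1d(t):
    """Sampled 1-D Gaussian with variance t (t = sigma**2), truncated at
    about 4 sigma and renormalised to unit sum (assumption of this sketch)."""
    radius = max(1, int(math.ceil(4.0 * math.sqrt(t))))
    weights = [math.exp(-(k * k) / (2.0 * t)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_rows(image, kernel):
    """Convolve every row with a symmetric kernel, replicating border pixels."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def scale_space_level(image, t):
    """One level L(., .; t) = g(., .; t) * f, using the separability of g."""
    if t == 0:
        return [[float(v) for v in row] for row in image]  # L(x, y; 0) = f(x, y)
    kernel = gaussian_kernel_1d(t)
    rows_done = smooth_rows(image, kernel)
    return transpose(smooth_rows(transpose(rows_done), kernel))

# A single bright pixel: its scale-space levels are sampled 2-D Gaussians.
size, centre = 21, 10
f = [[0.0] * size for _ in range(size)]
f[centre][centre] = 1.0

levels = {t: scale_space_level(f, t) for t in (0, 1, 4)}
peaks = {t: level[centre][centre] for t, level in levels.items()}
print(peaks)  # the peak drops as t grows, roughly like 1 / (2 * pi * t)
```

Computing levels only at t = 0, 1 and 4 is an arbitrary choice here; the continuum of scales discussed in the article is sampled at whatever values an application needs.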

The motivation for generating a scale-space representation of a given data set (maybe use signal instead of data set?) originates from the basic fact (not all readers will accept this as a basic fact, maybe observation is better?) that real-world objects are composed of different structures at different scales. This implies that real-world objects, in contrast to idealized mathematical entities such as points or lines, may appear in different ways depending on the scale of observation. For example, the concept of a "tree" is appropriate at the scale of meters, while concepts such as leaves and molecules are more appropriate at finer scales. For a machine vision system analysing an unknown scene, there is no way to know a priori what scales are appropriate for describing the data. data refers here, I believe, either to the image itself or to some useful information which the image carries about something. Is it possible to be more precise? Hence, the only (only is a strong word here?) reasonable approach is to consider descriptions at all scales simultaneously. It may help to sometimes use the word size, which perhaps is more familiar to most readers, instead of scale?

The rest of the lead section is rather technical and should be edited into one or a few separate sections with explanatory headings. I would also like to see some images which illustrate a scale space. I could try to do some myself, but I believe that some of the previous editors of the article must have some nice images in their drawers?

Minor detail: the article consistently spells it scale-space except in the main title and the first occurrence. This should be corrected, or are there some reasons not to?

Major detail: It appears to me that the scale concept is explored in a very similar (but non-Gaussian) way in wavelet theory and multiresolution analysis. Is it worth a comment in this article?

--KYN 23:34, 17 August 2007 (UTC)

Yes, there are lots of multi-scale things worth referring to briefly and linking, if they're not there already. Dicklyon 00:09, 18 August 2007 (UTC)

Thanks for opening these issues for discussion and for pointing out a number of things that can be improved. I have written major parts of this text and I'll try to give my opinions. Concerning figures, I do have some in my drawers. For copyright reasons, however, I would need to make new ones in order to put them in Wikipedia. Concerning the relations to wavelet theory and multi-resolution analysis, there is a sentence about this towards the end of the current article. Below you find my response to your detailed comments (my replies to your comments in red are in brown):

Scale space theory is a framework (can we say mathematical framework to be more precise?) (I would say that scale-space is more than just a mathematical framework. The notions of scale-space have been very much influenced by physics and computer science. In physics it is well known that any measurement requires a non-infinitesimal aperture; such non-infinitesimal apertures are one of the cornerstones of scale-space. The motivations have also originated from the subfield of computer science known as computer vision, which aims at constructing vision systems that interpret images automatically, not to mention the close relations to biological vision. From these perspectives, I would find a reformulation as a "mathematical framework" partly misleading.) for multi-scale signal representation developed by the computer vision, image processing and signal processing communities. It is a formal theory for handling image structures (why only images?) (You are right in the respect that scale-space can also be applied to other types of signals. The dominant application area today, however, is images, and therefore I find it more appropriate to use "image" here.) at different scales in such a way that fine-scale features can be successively (the word successively suggests that there is a time variable involved. Although this is sometimes correct in practice, it is not a formal characteristic of a scale space to think about the scale levels as being produced one after the other, is it?) (Actually, it is. An essential requirement of a scale-space is that any coarse-scale representation should be a simplification of any finer-scale representation. This property is very closely related to the semi-group property and distinguishes scale-space representation from other types of multi-scale representations that do not satisfy this property.) suppressed and a scale parameter $t$ (again t suggests time here) Yes, historically the relation to the diffusion equation was observed in the very first works on scale-space, and in the scale-space literature "t" is used for the variance of the Gaussian kernel. This is established terminology in the scale-space literature. can be associated with each level in the scale-space representation. The reader has now been told that there are levels in a scale space, but only in an implicit way. Can this be made more explicit? Presumably, yes. Some early figures would definitely help.
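As a small numerical illustration of the semi-group property mentioned in the reply above: for Gaussian kernels, smoothing with variance $t_1$ and then with variance $t_2$ is the same as smoothing once with variance $t_1+t_2$, so every coarse level can be computed from any finer level. The sketch below checks this in 1-D. It assumes sampled Gaussians on an integer grid, for which the identity holds only approximately (the continuous kernels satisfy it exactly); the values t1 = 2 and t2 = 3 and the helper names are arbitrary choices for the example.

```python
import math

def sampled_gaussian(t, radius):
    """Continuous Gaussian of variance t sampled at integers -radius..radius."""
    norm = math.sqrt(2.0 * math.pi * t)
    return [math.exp(-(k * k) / (2.0 * t)) / norm for k in range(-radius, radius + 1)]

def convolve_full(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

radius = 30
t1, t2 = 2.0, 3.0

# Smoothing in two steps: g(.; t1) * g(.; t2) ...
cascade = convolve_full(sampled_gaussian(t1, radius), sampled_gaussian(t2, radius))
# ... compared with one step: g(.; t1 + t2), on the same support.
direct = sampled_gaussian(t1 + t2, 2 * radius)

max_error = max(abs(c - d) for c, d in zip(cascade, direct))
print(max_error)  # very small at these scales
```

For very fine scales (t well below 1) the sampled kernels deviate noticeably from this behaviour, which is one reason discretization is treated as a topic of its own in the scale-space implementation article.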

The notion of scale-space is general and applies in arbitrary dimensions. For simplicity of presentation, however, we here describe the most commonly used case with two-dimensional images. It is not obvious for most readers why the two-dimensional case is the simpler case. We could say that 1D is simpler but 2D is more common, or I would even suggest presenting the 1D case SINCE IT IS SIMPLER! You are right that the 1-D case is simpler. The 1-D case, however, has a number of special properties that do not carry over to higher dimensions, and historically this has misled a number of researchers into generalizing properties that cannot be generalized. The 2-D case is clearly the most common case in the literature. The 2-D case also has far fewer special properties that do not carry over to 3-D or higher dimensions. Hence, I would say that the 2-D case is the GENERIC case. For a given image $f(x,y)$, its linear (why do we need a linear here? Are there also non-linear scale spaces?) (Yes, a multitude of non-linear scale-spaces have been proposed by various researchers. A few of these have rather well-defined theoretical foundations, while others could be regarded as more ad hoc. The linear scale-space, however, has served as the generic model for various non-linear generalizations. Towards the end of this article, there is a sentence about this with references to the two monographs that have so far been written on non-linear scale-spaces. I agree that it could be interesting for Wikipedia to describe non-linear scale-spaces. Since non-linear scale-spaces are more complex and, in addition, usually do not have as nice and simple a theoretical structure as the linear scale-space, I would find it more appropriate to present these in a separate article on non-linear scale-spaces. Please complement this if you can.) scale-space representation is a family of derived signals $L(x,y,t)$ defined by convolution of $f(x,y)$ with the Gaussian kernel

g(x,y,t)={\frac {1}{2{\pi }t}}e^{-(x^{2}+y^{2})/2t}.\,

such that

L(x,y,t)\ =g(x,y,t)*f(x,y).

where $t=\sigma ^{2}$ is the variance of the Gaussian. I believe that it is possible to talk about a non-Gaussian scale space, which will not satisfy all the scale space axioms. Are the axioms really defining the scale space concept or are they just providing specifications for a particularly useful class of scale spaces? Yes, you are right that other "scale-spaces" have also been proposed, although these do not satisfy the same theoretical properties. In Wikipedia, I have made a first step towards describing these in the article on multi-scale approaches. This article is, however, not yet as developed. Equivalently, the scale-space family can be generated from the solution of the heat equation,

\partial _{t}L={\frac {1}{2}}\nabla ^{2}L,

with initial condition $L(x,y,0)=f(x,y)$. This is interesting but does not have to be included in the lead section. Maybe a section Relation to the heat equation where this connection is developed? The current article is not organised into a lead section followed by subsections. Instead, it is written as a one- to two-page article without headers. The reason why the heat equation is emphasized is that theoretical analysis in many cases is simpler when done in terms of the diffusion equation. In fact, several of the theoretical derivations of linear scale-space are done with the diffusion equation as the primary tool. With regard to non-linear generalizations, many of these are also formulated in terms of non-linear generalizations of the diffusion equation. Hence, for those working with scale-space, the diffusion equation is really equally important. Several different derivations have been expressed showing that this is the canonical way to generate a linear scale-space, based on the essential requirement that new structures must not be created from a fine scale to any coarser scale. Conditions, referred to as scale-space axioms, that have been used for deriving the uniqueness of the Gaussian kernel include linearity, shift-invariance, semi-group structure, non-enhancement of local extrema, scale invariance and rotational invariance. Interesting, but can be moved from the lead to a separate section which explains the axioms. These theoretical properties are essential and distinguish scale-space from just "any type of smoothing method". Hence, I find it important to emphasize early that scale-space is much more than a mere ad hoc smoothing method.
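To make the stated equivalence with the diffusion equation concrete, here is a rough 1-D sketch that evolves $\partial _{t}L={\frac {1}{2}}\partial _{xx}L$ from an impulse and compares the result with the Gaussian of variance $t$. The explicit Euler scheme, the unit grid spacing, the step size dt = 0.1 and the replicated boundaries are all assumptions made for this illustration; they are not taken from the article or from the literature cited there. With the factor 1/2 and unit grid spacing, this scheme is stable for dt up to 1.

```python
import math

def diffuse(signal, t_end, dt=0.1):
    """Explicit Euler integration of dL/dt = 0.5 * d2L/dx2 on a unit grid,
    with replicated (zero-flux) boundaries."""
    steps = int(round(t_end / dt))
    level = [float(v) for v in signal]
    n = len(level)
    for _ in range(steps):
        nxt = []
        for i in range(n):
            left = level[max(i - 1, 0)]
            right = level[min(i + 1, n - 1)]
            nxt.append(level[i] + 0.5 * dt * (left - 2.0 * level[i] + right))
        level = nxt
    return level

def gaussian(x, t):
    """1-D Gaussian kernel with variance t."""
    return math.exp(-(x * x) / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

n, centre, t = 81, 40, 4.0
impulse = [0.0] * n
impulse[centre] = 1.0

by_diffusion = diffuse(impulse, t)                          # heat-equation route
by_gaussian = [gaussian(i - centre, t) for i in range(n)]   # convolution route

max_diff = max(abs(a - b) for a, b in zip(by_diffusion, by_gaussian))
print(max_diff)  # small but not zero: the grid discretization differs from sampling
```

The two routes agree closely rather than exactly, because diffusion on a discrete grid is not the same operation as sampling the continuous Gaussian; the gap shrinks as t grows relative to the grid spacing.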

The motivation for generating a scale-space representation of a given data set (maybe use signal instead of data set?) (Here "data set" was taken as a variation of "image", to vary the language. "Signal" may however have other connotations (see above)) originates from the basic fact (not all readers will accept this as a basic fact, maybe observation is better?) Maybe? that real-world objects are composed of different structures at different scales. This implies that real-world objects, in contrast to idealized mathematical entities such as points or lines, may appear in different ways depending on the scale of observation. For example, the concept of a "tree" is appropriate at the scale of meters, while concepts such as leaves and molecules are more appropriate at finer scales. For a machine vision system analysing an unknown scene, there is no way to know a priori what scales are appropriate for describing the data. data refers here, I believe, either to the image itself or to some useful information which the image carries about something. Is it possible to be more precise? Yes, "structures in the data" would be better. I'll change that. Hence, the only (only is a strong word here?) reasonable approach is to consider descriptions at all scales simultaneously. It may help to sometimes use the word size, which perhaps is more familiar to most readers, instead of scale? This formulation may be far too brief. A better way of explaining this would be something like: "Hence, in the absence of prior information about what scales are relevant, the natural way to achieve robustness to unknown size or scale transformations is by representing the data at multiple scales. Furthermore, if we do not want to restrict ourselves to a pre-determined sparse set of scale levels, the ideal generalization of this is by considering representations at all scales simultaneously. This is one of the major ideas of scale-space."

Please let me know about your reactions to this response (which I hope that you don't misunderstand). Your comments seem to indicate that there are a number of things in the article that can be explained better. Maybe the article is just too brief in relation to the material it describes. Originally, it was written as an extension of earlier descriptions that were much shorter and much less developed. Tpl 11:28, 24 August 2007 (UTC)

More comments

The reason for my concern about this article (not just the lead) is that a) scale-space is an important concept within (e.g.) computer vision and b) thanks to User:Tpl and others there is now a set of scale-space related articles in WP. For these reasons it makes sense to have at least one introductory article to this field which is relatively simple to understand. Perhaps not for any reader who happens to find it but at least for an undergraduate EE student. That article appears to be this one.

Since there was no objection, I take it that it is OK to move the article to new title "Scale-space" (with a hyphen)

I object. The hyphen is only appropriate when "scale-space" is used as an adjective. The notion of scale space as a noun does not need a hyphen. Basic English grammar. Dicklyon 05:01, 28 August 2007 (UTC)

For the reasons given above I suggest bringing the article to a state which better follows, for example, WP:BETTER even though it is not obvious how to accomplish that. My general conclusion is that a) the lead needs to be shorter and simpler, b) some of the stuff now in the lead must be worked into separate headings. I dare not do this myself given the hard work which has already been done by others who are far more experienced in the field, but I can contribute by pointing out some of the difficulties in reading the text for a non-expert. I focus on the lead and hope that some of the other issues may resolve themselves in that process.

According to WP:BETTER the first sentence of the lead should "give the shortest possible relevant characterization of the subject". It does not say explicitly that it also must make sense to someone not already familiar with the subject, but this is perhaps the very idea of an encyclopedia, and this is where I am still concerned. For example, "multi-scale signal representation" means that the reader must already have an idea of what both "multi-scale" and "signal representation" could mean. Here is a suggestion for a new lead:

Start lead

Scale-space is a theory for describing signals in different levels of resolution, developed by the computer vision, image processing and signal processing communities, to some extent based also on physics and biological vision. It represents the signal, typically an image, as a set of (scale) levels where each level is characterized by having a specific resolution relative to the original signal. The resolution, often referred to simply as scale, describes how fine the details are that can be expected at that level. The highest scale corresponds to the original image, and moving successively to lower scales removes more and more of the details. A central idea is that there is a continuum of scale levels, even though in practice the scale levels are only computed for a finite set of scale values. For a particular signal, its scale-space refers to the set of scale levels produced by some specific scale-space theory.

In general, scale-spaces can be formulated for signals of arbitrary dimension, using linear or non-linear computations. The most common applications, however, are for 2D images and by far the most common approach for computing the scale levels $L(x,y,t)$ is based on convolution of the original image f by a Gaussian function g:

L(x,y,t)\ =g(x,y,t)*f(x,y)

where

g(x,y,t)={\frac {1}{2{\pi }t}}e^{-(x^{2}+y^{2})/2t}.\,

and $t=\sigma ^{2}$ is the variance of the Gaussian and also the index of the corresponding scale level. There are 2 formal issues related to the convolution expression which need to be sorted out: 1) what does it mean to convolve a 3 variable function with a 2 variable function, 2) in this particular formulation it is perfectly possible to be strict and not put the variables inside the convolution operation. For 1) is it possible to define both L and g as 2D functions with an additional index; e.g., $L_{t}(x,y)$ ? This would also help solving issue 2)

This type of scale-space is interesting since it is the only one to satisfy a set of useful properties commonly referred to as the scale space axioms. Furthermore, the set of 2D functions $L(x,y,t)$ , indexed by the scale parameter t, can be shown to be the solution of the heat equation.

Stop lead (missing wikilinks need to be inserted)

The rest of the article can have sections with headings such as

Motivation/background (in relation to biological vision / physics)
Applications in computer vision and image processing (move subheadings 2-4 in Further readings to here + stuff now in the lead)
The scale space axioms (short overview with link to the main article)
Implementation issues (perhaps expand subheading 5 in Further reading and link it to the main article)

In my view, most of the stuff is there already, we only need to reshuffle a bit and be focused in the lead section, then this could be a great article. --KYN 22:57, 24 August 2007 (UTC)

I think your ideas are pretty much OK, and if you want to start, then I bet tpl and I will be able to help. We might not leave the lead exactly as you propose, but I agree something like that would be helpful. Dicklyon 05:01, 28 August 2007 (UTC)

If the article is to be reorganized into a brief lead followed by a number of labelled sections, it may require a major rework to fit well with such an organization. Nevertheless, I could make a first draft according to these ideas. An important aspect concerns what background should be assumed of the reader. You are right that the article could be made more introductory for a newcomer. However, some background knowledge must definitely be assumed, so that the article does not lose focus or become too plain for an experienced reader. (Today, the article is written such that computer vision researchers not familiar with scale-space can benefit substantially from reading this article.) A style that one could follow is to use the appropriate terminology throughout, while also explaining the meaning of essential terms in a layman's vocabulary. The use of complementary figures will also help substantially. (The copyright requirements of Wikipedia are however too strict for figures that have previously been published elsewhere, so I would have to generate any figure that I upload.) If you agree on this, I could make a first outline according to these ideas, trying to be more tutorial in style. Concerning time planning, however, I'm a bit occupied at the moment with getting a project up and running after the summer, so I expect I could need a week or two. If you want to look up other tutorials on scale-space meanwhile, please see the external references from 1996 and 1994 at the bottom of the scale-space page. In these tutorials, much more introductory material is given, such that the material can be reasonably well understood by a newcomer. Tpl 06:48, 28 August 2007 (UTC)

P.S. Concerning the convolution operation, there was originally a semicolon separating the variables (x, y) from the parameter t, using the established notation L(x, y; t) = g(x, y; t) * f(x, y) in the scale-space literature. Someone, however, has apparently changed that, without understanding the implications. This is one of the drawbacks of Wikipedia ...

Indeed, having a compact notation for this is a bit tricky. In the scale-space literature, subscripts are usually used for denoting derivatives. Hence, a subscript for t would be misleading. Actually, the current notation is also too simplified, since one should not use variables in a convolution operator. Rather, one should use dots instead. In order not to confuse readers unfamiliar with such a notation, I avoided that, hoping this notation would be a reasonable compromise between condensedness and ease of understanding. Tpl 06:48, 28 August 2007 (UTC)

Plan for modifications

Given the above discussion on how to make the article better, I made a picture of what such an article could look like, at least in terms of structure and separation into sections, though perhaps not the detailed content. It can be found here. It also includes some comments and questions which need to be taken care of.

There has just been a recent rewrite by Terjen, which seems to go in sort of the right direction, dividing the article into sections, even though many of the issues discussed above and questions in "my picture" are not addressed. Apparently, Tpl is also making a draft for a new shape of the article.

I will wait a bit for comments from other editors before making any modifications to the current version. There seems to be little point in making incremental changes until there is agreement on the direction to go, based on an overall structure that seems reasonable. --KYN 10:02, 3 September 2007 (UTC)

Following the comments above, I have now made a first version of a major restructuring of the scale-space article. The intentions I have tried to follow are (i) to be much more tutorial and explain the notation, interpretation and consequences of important parts of the scale-space theory in layman's terms, and (ii) to organize the article as a longer Wikipedia article with different subsections that are relatively self-contained, permitting a certain overlap with some of the other related Wikipedia articles in order to make it easier for a newcomer to grasp the message. Possible disadvantages of this restructuring are that the condensedness of the previous version may have been lost and that the text may be regarded as too verbose by readers who already have good knowledge of this field or related fields. Nevertheless, I hope that the article in its revised form will be more useful to a wider readership.

In the rewrite, I have tried to take the questions previously raised by KYN and DickLyon into account. Please let me know if you see that something is still missing or too hard to grasp. Figures are still to be included, but I have added remarks concerning at least two figures that ought to be there. Tpl 18:23, 4 September 2007 (UTC)

Images

Here are some examples of images which can be used for the article. I have set up a small Matlab script for doing the computations which is relatively easy to change. It uses a sampled Gaussian with two different variances. If someone has suggestions for improvements, I can try to fix them. --KYN 12:13, 3 September 2007 (UTC)

I'd say start with a more interesting image with some natural grayscale stuff (maybe also the text), and use more scales, at least 0, .5, 1, 2, 4, 8, 16. Maybe animate the results. Dicklyon 16:38, 3 September 2007 (UTC)

Yes, your figures are good, but I also agree that it would be more illustrative to use grey-scale images that show more of the merits of scale-space. When I've written about scale-space previously, I've often combined a few levels in scale-space with the output of a feature detector, in order to make the effects more visible with respect to how they influence later-stage visual processes. Tpl 07:25, 4 September 2007 (UTC)

There's too much jump from scale 0 to 1, and not enough from 2 to 3. I recommend 0.5, 1, 2, 4. Dicklyon 14:22, 12 September 2007 (UTC)

I agree, using a linear scale on t has that effect, and quadratic or exponential is probably better. Will try to fix it soon... --KYN 18:48, 12 September 2007 (UTC)

One of us has a bug.

When I blur with t=1 using my own Matlab code, it's not nearly as blurry as yours. Can I review your code for you? I'll send mine if you want it. Dicklyon 18:54, 12 September 2007 (UTC)

Your code is probably fine, but my t isn't calibrated to a "pixel unit", which I guess yours is, but rather to some arbitrary scale unit. --KYN 21:25, 12 September 2007 (UTC)

Well, if we're going to illustrate what t does, we ought to be calibrated. How about I filter your starting images at a few scales for you? Dicklyon 22:46, 12 September 2007 (UTC)

The current set of images has t calibrated in units of pixel^2. --KYN 05:38, 13 September 2007 (UTC)
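As a concrete reference point for the calibration question, here is a minimal sketch of Gaussian smoothing with t given in pixel^2. It is not either of the Matlab scripts discussed above (neither is shown here); it assumes NumPy and SciPy, uses `scipy.ndimage.gaussian_filter` as the sampled Gaussian, and the helper names `scale_space_slice` and `measured_variance` as well as the impulse test image are made up for illustration. Since t is the variance, the standard deviation handed to the filter is sqrt(t) pixels.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def scale_space_slice(image, t):
    """Smooth `image` to scale t, where t is the Gaussian variance in pixel^2."""
    image = np.asarray(image, dtype=float)
    if t == 0:
        return image.copy()  # scale 0 is the image itself
    # gaussian_filter expects a standard deviation, i.e. sqrt(variance)
    return gaussian_filter(image, sigma=np.sqrt(t), mode="nearest")


def measured_variance(blurred_impulse):
    """Variance (pixel^2) along one axis of a blurred unit impulse."""
    profile = blurred_impulse.sum(axis=0)  # marginal along the second axis
    x = np.arange(profile.size)
    mean = (profile * x).sum() / profile.sum()
    return (profile * (x - mean) ** 2).sum() / profile.sum()


# Hypothetical test image: a single bright pixel in the centre.
impulse = np.zeros((129, 129))
impulse[64, 64] = 1.0

scales = [0.5, 1, 2, 4, 8, 16]  # roughly exponential spacing in t
slices = {t: scale_space_slice(impulse, t) for t in scales}
variances = {t: measured_variance(slices[t]) for t in scales}

for t in scales:
    print(f"t = {t:4}: measured variance = {variances[t]:.3f} pixel^2")
```

Blurring a single bright pixel and measuring the variance of the result is a quick way to check whether two implementations agree on the unit of t; with a calibrated t the measured variance should come out close to t itself.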

Scale-size-resolution

Thanks Tpl for reworking the article into a more readable form. It now has a reasonable division into sections and some of the rough edges have been smoothed. As usual, I'm still a bit worried that the lead section is a bit too detailed and advanced to allow a non-expert to see what is going on in the rest of the article. Some ideas for improvements both here and in the remaining parts:

The article uses scale in a sort of self-referenced way. Although there are a couple of examples of what it could mean, I'm suggesting to use alternative and similar words rather than always using scale, for example, resolution or size. I agree that they are not always directly synonymous, but should work in some cases. It may even be reasonable to discuss what is the difference (if any) between these concepts. The scale article does not provide a good explanation about what scale means, it appears to be a stub.

In physics, there is a rather well established notion of scale. When writing the scale-space article, I have assumed this background context, which is also established in some areas of applied mathematics and engineering. If you take it to the extreme, I could in some respect accept that this notion is not always 100 % well-defined. If you want to look at it from such a viewpoint, however, I could say that scale-space representation provides an operational definition of scale, in terms of the scale parameter in scale-space and the types of image features that will be explicit in the scale-space representation at a certain scale.

Although the term resolution has also been used in some early scale-space literature, I find that terminology somewhat confusing. First of all, fine-scale structures correspond to a high resolution. Hence, the meaning of resolution would rather be inverse scale. Secondly, resolution is in many cases used for denoting the sampling density of the image, with an implicit coupling to scale assuming a proper sampling process.

Concerning the terminology size, such a terminology may also be confusing. For example, if you look at different ways of implementing a smoothing filter, you can consider masks of different ...

The lead (1st paragraph) uses the term layer but does not describe what it means. Scale level first appears in section Automatic scale selection and scale invariant feature detection, but still without a proper explanation. The scale levels are simply called derived signals L in section Definition.

Hopefully a few figures could remedy possible confusions. I'll also try to think of ways of being more explicit. Being more tutorial is much easier in a survey article or a book, where the text on the topic would be much longer and with many more examples.

The uninitiated reader is still without a clue about what is going on with the convolution between a 3-variable and 2-variable function in section Definition. The semi-colon convention is, as far as I know, not a standard outside the scale-space literature. The article should be introducing the reader to this field and cannot rely on them already knowing about this convention.

Sorry, noticed that there is now some sort of explanation for the semicolon. Still, I wonder why it is necessary to have it (the semicolon convention) here in the article. Is it just for being compatible with some of the literature? --KYN 22:56, 8 September 2007 (UTC)

What would you prefer? A subscript t would still need an explanation, wouldn't it? Dicklyon 23:13, 8 September 2007 (UTC)

The semi-colon notation between variables and parameters is well accepted in some areas of applied mathematics, from which it has been borrowed. Maybe one should write out the convolution operation explicitly as an integral. This will however generate more complex notation that may scare some readers.

A subscript t is however not acceptable. It would not be consistent with the established literature, and it would imply a clash with the notation for derivatives. Tpl 17:38, 10 September 2007 (UTC)

What about applications outside computer vision? Is mipmapping in computer graphics worth spending one or two sentences on? 1D applications?

The first three issues are addressed in the "picture" I made here. I'm not suggesting that the formulations used there are necessarily the best solution to these issues, but they give an idea of what I'm talking about. --KYN 22:46, 8 September 2007 (UTC)

I agree that picture can help. Have you considered the improvements that we suggested? Don't worry about the feature detection for now; we can compute that on your images later for a separate set of images. I'll take a look at what easy text improvements are possible. Dicklyon 22:52, 8 September 2007 (UTC)

Yes, I will try to fix a new set of images as soon as I have found a good "natural image". Any suggestion? --KYN 22:59, 8 September 2007 (UTC)

Start with any photo you own, add some text if you like, etc. I made a few minor edits; let me know whether you think they helped. I took out the notes about where figures might be useful. Dicklyon 23:12, 8 September 2007 (UTC)

If you give me more time, I could possibly try to generate a few images. Currently, however, I have to generate new illustrations for another purpose, and I have to give that other project priority. Tpl 17:38, 10 September 2007 (UTC)

I'm sorry to say this, but after the recent changes of the article about a week ago, there are a number of statements that are quite unfortunate:

'In principle, any filter g of low-pass type and with a parameter t which determines its width may be used to generate a scale-space by convolution with f. However, the Gaussian scale-space, defined above, is the most commonly used both in practice and in the literature, and for good reasons.'

This is not correct. It is of crucial importance that the smoothing filter used for generating the scale-space does not introduce new spurious structures at coarse scales that do not correspond to simplifications of corresponding structures at finer scales. I could recommend the article Scale-space for discrete signals from 1990 (IEEE-PAMI) for an overview as well as a detailed treatment of this issue, which also applies to continuous signals.

This property is clear to me; what is not (or has not been) clear is that it is also a strict defining property of a scale-space. Are you saying that if we change the Gaussian filter to, let's say, 1/(1+r^2), then the resulting L cannot be referred to as a scale-space, since it doesn't satisfy the above-mentioned axiom? If not, what should we call this L? --KYN 21:22, 24 September 2007 (UTC)

For one-parameter families generated by certain smoothing filters that do not satisfy reasonable scale-space axioms while still generating smoothed images with a multi-scale character, the terminology 'multi-scale representation' may be appropriate. There are however also filters with unexpected side effects for which not even intuitive requirements of a multi-scale representation are satisfied. For examples of this, consider the box filter treated in the abovementioned PAMI article from 1990, or some other filter with a non-unimodal Fourier transform. Tpl 16:46, 25 September 2007 (UTC)
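To make the remark about a non-unimodal Fourier transform concrete, here is a small sketch (an illustration written for this discussion, not an example taken from the 1990 PAMI article) comparing how a 1-D box filter and a Gaussian attenuate a cosine of angular frequency ω. It assumes the continuous case, where a unit-area box of width w has the transfer function sin(ωw/2)/(ωw/2) and a Gaussian of variance t has exp(-ω²t/2); the function names and the chosen widths are hypothetical.

```python
import numpy as np


def box_response(omega, width):
    """Transfer function of a unit-area box filter of the given width."""
    # sin(omega*width/2) / (omega*width/2); np.sinc(x) is sin(pi*x)/(pi*x)
    return np.sinc(omega * width / (2 * np.pi))


def gaussian_response(omega, t):
    """Transfer function of a Gaussian with variance t."""
    return np.exp(-omega**2 * t / 2)


omega = 1.0                                  # angular frequency of the test cosine
widths = np.array([1.0, 2.0, 3.0]) * np.pi   # increasingly coarse box filters
box_gain = box_response(omega, widths)

# Gaussians with the same variance as each box (a box of width w has variance w^2/12)
gauss_gain = gaussian_response(omega, widths**2 / 12)

for w, b, g in zip(widths, box_gain, gauss_gain):
    print(f"width {w:5.2f}: box gain {b:+.4f}, Gaussian gain {g:+.4f}")
```

With these widths the box gain passes through zero at width 2π and is nonzero with the opposite sign at 3π, so the cosine vanishes at one scale and comes back, inverted, at a coarser one. The Gaussian gain only decreases, which is the behaviour the scale-space axioms ask for.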

The article mentions generalizations of the scale-space idea, and I take it that some of these do not satisfy (all of) the axioms, perhaps not even the non-introduction of structures at coarser scales. If you are strict on this point, how can we explain these generalizations? --KYN 21:22, 24 September 2007 (UTC)

The affine scale-space mentioned a few times in the article basically satisfies similar scale-space properties as the Gaussian scale-space, except for rotational symmetry. There are also a few non-linear scale-spaces that obey for example non-enhancement of local extrema and for which 'scale-space' is an appropriate terminology in the non-linear case. There are however also other non-linear evolution schemes with awkward side effects that hardly mimic any of the nice properties of the linear Gaussian scale-space and for which the terminology 'scale-space' would be highly misleading. Tpl 16:46, 25 September 2007 (UTC)

If you are serious about excluding any non-Gaussian g, then this should be mentioned already in the lead, which now only says "low-pass", a term with a much more general interpretation. --KYN 21:22, 24 September 2007 (UTC)

Now, I have modified the formulation in the lead about this. Tpl 16:46, 25 September 2007 (UTC)

'Although this connection (to the diffusion equation or heat equation) appears superficial from the outset, the analogy with the diffusion process is relevant for formulating non-linear scale-spaces, for example, using anisotropic diffusion.'

In fact, the main scale-space formulation in terms of non-enhancement of local extrema (causality) is expressed in terms of a differential geometric condition on the partial differential equation that scale-space satisfies. The previous formulation with 'equivalently' had a meaning much closer to the truth.

This connection to the "non-enhancement of local extrema" axiom was not obvious to me from the earlier formulations, but your newer edits make it clearer. It could mean that the connection to physics (in terms of heat diffusion) should be reduced and that the general aspects of the diffusion equation (without a specific physical interpretation) should be emphasized instead, perhaps even with an attempt to explain how the "non-enhancement of local extrema" is encoded in the equation. This way, the non-expert doesn't have to wonder what heat has to do with the story. --KYN 21:22, 24 September 2007 (UTC)

The analogy with the diffusion equation is fine, since it shows the similarity with a well-behaved physical process. A problem of developing non-enhancement of local extrema in the current article starting from a convolution formulation, however, is that certain regularity assumptions are needed to guarantee differentiability, which is needed to formulate the requirement of non-enhancement of local extrema. In the abovementioned PAMI 1990 article as well as the article 'On the axiomatic foundations of linear scale-space', this formulation is developed with the necessary mathematical background. Tpl 16:46, 25 September 2007 (UTC)

Now, I have rewritten these statements to be in agreement with the scale-space literature. I have also changed the previous wording 'scale-space levels' to the established terminology 'scale-space representation'. (The established terminology is to use 'scale-space representation' for the 2+1-D scale-space generated by the diffusion equation or equivalently by Gaussian smoothing. The terminology 'scale-space level' may then be used for a specific 'slice' of the scale-space representation.) Tpl 14:24, 24 September 2007 (UTC)

What is perhaps missing right now is a clear definition of level or scale-space level as one of these slices. At best, this can be understood from the context, but I suggest that a proper definition is provided since this concept is used extensively throughout the article. --KYN 21:22, 24 September 2007 (UTC)

I hope that this should be understood from the context. Maybe others could give a third opinion on this. Tpl 16:46, 25 September 2007 (UTC)

The discussion continues

As mentioned above, I would like to see a clear definition of scale-space level, which is used at several points in the article, instead of relying on context. I suggest using the first paragraph of the Scale space#Definition section:

The notion of scale-space applies to signals of arbitrary numbers of variables. The most common case in the literature applies to two-dimensional images, which is what is presented here. For a given image $f(x,y)$, its linear (Gaussian) scale-space representation is a family of derived signals $L(x,y;t)$, the scale-space representation, defined by the convolution of $f(x,y)$ with the Gaussian kernel

The second sentence seems to introduce the concept scale-space representation twice, and instead it can introduce/define both concepts:

The notion of scale-space applies to signals of arbitrary numbers of variables. The most common case in the literature applies to two-dimensional images, which is what is presented here. For a given image $f(x,y)$, its linear (Gaussian) scale-space representation is a family of derived signals $L(x,y;t)$, the scale-space levels, defined by the convolution of $f(x,y)$ with the Gaussian kernel

Yes/no? --KYN 20:19, 9 October 2007 (UTC)

Yes. While I agree with Tony that it's clear in context, it never hurts to have a more explicit definition; try one, and we'll find out if we all understand it the same way or not. Dicklyon 02:03, 10 October 2007 (UTC)

The previous sentence is obviously the result of a typing error in a series of successive rewritings. The following formulation is better:

The notion of scale-space applies to signals of arbitrary numbers of variables. The most common case in the literature applies to two-dimensional images, which is what is presented here. For a given image $f(x,y)$, its linear (Gaussian) scale-space representation is a family of derived signals $L(x,y;t)$ defined by the convolution of $f(x,y)$ with the Gaussian kernel

Please also note that in the literature scale-space representation is used both for the 3-D data volume and for 2-D slices, if complemented by wording of, e.g., the following type: "the following figure shows the scale-space representation at scale t=16". In a revised version of the article, I have tried to remove the occurrences of 'scale-space level' and replace them with more established terminology. I hope that this eliminates the problem. Tpl 09:12, 10 October 2007 (UTC)

The next issue to deal with is the assumption that any reader must accept the axioms as a basis for doing signal processing in general and image processing in particular. They are well motivated, but we cannot say that they are compulsory for any and all applications. For example, in subsection "Why a Gaussian filter?": what we can say is that, under the assumption that the smoothing filter should not introduce new spurious structures at coarse scales that do not correspond to simplifications of corresponding structures at finer scales, the Gaussian filter is the only choice.

In relation to that, the following two paragraphs appear to present two different (but related?) motivations for the Gaussian filter. Paragraph 2 seems to repeat a full sentence which is already found in paragraph 1; can they be merged? What exactly does Equivalently in paragraph 3 refer to? --KYN 17:25, 15 October 2007 (UTC)

Nothing in the article says they all have to be accepted. There is discussion in the implementation article about relaxing some of them. But this article is to represent what the field of "scale space" is about, and the axioms and Gaussian have been part of that from its earliest days, and are in all the sources. "Equivalently" means another way to get to the same place; you could try saying that some other way. Dicklyon 05:37, 16 October 2007 (UTC)

Concerning the two related sentences that you point to, these are the result of the successive rewriting of the article that has been going on lately. The second sentence is from the original text, while the first sentence is a reaction to previous text in the paragraph. Tpl 15:02, 17 October 2007 (UTC)

Concerning the interpretation of equivalently, the following mathematical results constitute the theoretical background:

1. If you start with a family of functions generated by convolution with Gaussian kernels, it is straightforward to show that this family of functions satisfies the diffusion equation. Hence, Gaussian smoothing implies the linear and isotropic diffusion equation.

2. If you start with the diffusion equation and a certain initial condition and ask which functions satisfy the diffusion equation, it can be shown that the solution is obtained by convolving the original function with Gaussian kernels. Hence, the linear and isotropic diffusion equation implies Gaussian smoothing.

3. The abovementioned results show that Gaussian smoothing and the linear diffusion equation are mathematically equivalent ways of describing the same set of functions.

4. In the scale-space literature, different theoretical formulations have been expressed starting either from a convolution formulation with some family of initially unknown functions or from a differential geometric study of an initially undetermined one-parameter family of functions with some specific conditions imposed on them. In even more concrete terms, some of the studies have started from a convolution formulation while others have started from a partial differential equation, in the end arriving at the same result.

5. The statement in the article about equivalent reflects the fact that these two ways of looking at the same problem really mean the same, i.e. they are equivalent ways of expressing the same fact. Tpl 15:02, 17 October 2007 (UTC)
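Point 1 in the list above can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the cited derivations: it takes the Gaussian kernel with t as the variance, g(x, y; t) = exp(-(x²+y²)/(2t)) / (2πt), and compares a finite-difference estimate of ∂g/∂t with half the discrete Laplacian of g. With this parameterization the diffusion equation carries a factor 1/2, i.e. ∂L/∂t = ½ ∇²L; since convolution with f is linear, the same equation then holds for L = g * f. Grid spacing, step size and variable names are arbitrary choices.

```python
import numpy as np


def g(x, y, t):
    """2-D Gaussian kernel with variance t."""
    return np.exp(-(x**2 + y**2) / (2 * t)) / (2 * np.pi * t)


axis = np.linspace(-6.0, 6.0, 241)
dx = axis[1] - axis[0]                     # 0.05
X, Y = np.meshgrid(axis, axis, indexing="ij")

t, h = 1.0, 1e-4
G = g(X, Y, t)

# Central difference in the scale direction.
dG_dt = (g(X, Y, t + h) - g(X, Y, t - h)) / (2 * h)

# Five-point discrete Laplacian on the interior of the grid.
lap = (G[2:, 1:-1] + G[:-2, 1:-1] + G[1:-1, 2:] + G[1:-1, :-2]
       - 4 * G[1:-1, 1:-1]) / dx**2

# Largest violation of  dg/dt = (1/2) * laplacian(g)  on the interior.
residual = np.max(np.abs(dG_dt[1:-1, 1:-1] - 0.5 * lap))
print(f"max |dg/dt - 0.5 * laplacian| = {residual:.2e}")
```

The residual that remains is discretization error from the finite differences; it shrinks as the grid spacing is reduced.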

Psychological vision

Does this term that an anon keeps adding make sense to anyone? Not to Google book search; see [1] and [2]. Dicklyon 15:11, 19 October 2007 (UTC)

Thanks for fixing this. I cannot make sense of this notion either. I can say that there have been a few (very few) papers on computational explanations of a few psychophysical experiments with scale-space modelling. For an encyclopedia such as Wikipedia, however, the previous formulation with biological vision is much better. (Moreover, this formulation also encompasses the references I have in mind.) Tpl 08:15, 20 October 2007 (UTC)

There seems to be an error

Cite:

g_{t}(x,y)={\frac {1}{2{\pi }t}}e^{-(x^{2}+y^{2})/2t}\,

.... This definition of L works for a continuum of scales t >= 0, but typically only a finite discrete set of levels in the scale-space representation would be actually considered.

t is the variance of the Gaussian filter and for t = 0 the resulting filter g becomes an impulse function such that L(x,y;0) = f(x,y), that is, the scale-space representation at scale level t = 0 is the image f itself.

If you put t = 0 into g you'll get: $g_{t}(x,y)={\frac {1}{2{\pi }t}}e^{-(x^{2}+y^{2})/2\cdot 0}=BANG\,$

--85.181.131.211 (talk) 14:50, 7 March 2010 (UTC)

The statement in the text follows from $\lim _{t\rightarrow 0}g_{t}(x,y)=\delta (x,y)=\delta (x)\delta (y)$, i.e., when t approaches 0, $g_{t}$ approaches the 2D impulse function. Since the result of convolving f(x,y) with that impulse function is again f(x,y), the statement in the text follows. I tried to clarify this in the text. --KYN (talk) 15:45, 7 March 2010 (UTC)
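The limit can also be seen numerically. The following sketch is an illustration only: the test image f(x, y) = cos(x) cos(y) and the helper `L_at_origin` are assumptions made for this example. It evaluates the convolution at the origin by direct numerical integration for a sequence of decreasing t. The kernel formula itself cannot be evaluated at t = 0 (division by zero), but the smoothed value approaches f(0, 0) = 1 as t shrinks. For this particular f the integral has the closed form exp(-t), which gives something to compare against.

```python
import numpy as np


def g(x, y, t):
    """2-D Gaussian kernel with variance t (defined for t > 0 only)."""
    return np.exp(-(x**2 + y**2) / (2 * t)) / (2 * np.pi * t)


def f(x, y):
    """Hypothetical smooth test image."""
    return np.cos(x) * np.cos(y)


def L_at_origin(t, n=401):
    """Approximate (g_t * f)(0, 0) by a Riemann sum over +/- 8 standard deviations."""
    sigma = np.sqrt(t)
    axis = np.linspace(-8 * sigma, 8 * sigma, n)
    dx = axis[1] - axis[0]
    X, Y = np.meshgrid(axis, axis, indexing="ij")
    # g is symmetric, so the convolution at the origin is the integral of g * f
    return np.sum(g(X, Y, t) * f(X, Y)) * dx**2


ts = [1.0, 0.1, 0.01, 0.001]
values = [L_at_origin(t) for t in ts]
errors = [abs(v - f(0.0, 0.0)) for v in values]

for t, v, e in zip(ts, values, errors):
    print(f"t = {t:6}: L(0,0;t) = {v:.6f}, |L - f(0,0)| = {e:.6f}")
```

The integration window scales with sqrt(t), so the ever narrower and taller kernel is always resolved by the grid; its integral stays 1 while its support shrinks, which is the impulse behaviour described above.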

Notation concerning the convolution operation

In a number of previous edits, the notation for convolution had been changed so that it no longer was correct. Please, note that from a mathematical viewpoint, one cannot use notation of the form

L_{x^{m}y^{n}}(x,y;t)=\partial _{x^{m}y^{n}}g(x,y;\;t)*f(x,y)

since the convolution operation operates on functions $f$ and $g$ and not on function values at a single specific point, $f(x,y)$ and $g(x,y;\;t)$. For this reason, one usually writes the convolution operation as

L_{x^{m}y^{n}}(\cdot ,\cdot ;\;t)=\partial _{x^{m}y^{n}}g(\cdot ,\cdot ;\;t)*f(\cdot ,\cdot )

where the dots $\cdot$ are to be interpreted as placeholders for the arguments over which the convolution is performed. In particular, a notation of the form

L_{x^{m}y^{n}}(x,y;t)=\partial _{x^{m}y^{n}}g_{t}(\cdot ,\cdot )*f(x,y)

that was used previously does, formally speaking, not make any sense at all (although a trained person might be able to convert it to something meaningful). Tpl (talk) 07:12, 28 September 2010 (UTC)
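Written out explicitly as an integral, which is one way to read the dot notation, the undifferentiated case looks as follows. The integration variables $u$ and $v$ are names introduced here for illustration only.

```latex
L(x, y;\; t)
  = \bigl( g(\cdot, \cdot;\; t) * f(\cdot, \cdot) \bigr)(x, y)
  = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty}
      g(x - u,\, y - v;\; t)\, f(u, v)\, du\, dv
```

The convolution produces a function, which is only afterwards evaluated at the point $(x, y)$; that is the distinction the dot notation is meant to preserve.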

Scale-space or scale space?

I notice that most of the links in the article use "scale-space", but the article name is "scale space" (no hyphen). The article should probably be consistent about this. 70.247.173.254 (talk) 23:27, 22 July 2012 (UTC)

We should follow normal hyphenation rules: hyphenate when the compound is used as an adjective, and not otherwise. If it's not so, fix it. Dicklyon (talk) 00:12, 23 July 2012 (UTC)

Sure enough, someone went hyphen-happy. I fixed. Dicklyon (talk) 00:20, 23 July 2012 (UTC)

Appreciate the fix. Wasn't aware of a policy about hyphens. Care to point me to it? 70.247.173.254 (talk) 07:34, 25 July 2012 (UTC)

I don't think there's a policy, but there is info about normal English hyphenation: see Compound modifier and WP:HYPHEN. Dicklyon (talk) 21:50, 2 August 2012 (UTC)

I am going to invoke WP:COMMONNAME, since every source in the article uses "scale-space" with a hyphen, including two encyclopedias [3][4]. There is only one exception[5]. --Enric Naval (talk) 19:05, 2 August 2012 (UTC)

I looked at google scholar. "scale-space" uses a hyphen[6] but "scalar space" doesn't [7]. --Enric Naval (talk) 19:18, 2 August 2012 (UTC)

Editor User:Tpl has mostly referenced his own work, and he's the one that started the over-use of the hyphen, I think. He's not from a native English speaking country, so I assume he may have been less familiar with the subtleties of English grammar, and picked up on the hyphen in Witkin's original paper, where "scale-space" was used as an adjective. The Dutch largely followed him. Some books, like this one go the other extreme, and omit the hyphen everywhere. But there's really no reason why WP should follow either form of mis-punctuation, when it serves the user best to use English grammar and punctuation standards to help them parse the phrases correctly. Dicklyon (talk) 21:43, 2 August 2012 (UTC)

I undid the move you did, Enric, which obviously could not have been considered uncontroversial in light of this discussion. Also verified that the noun/adj. phrase distinction is in Witkin's original and subsequent papers: [8] and [9]. The unhyphenated form was adopted by others independent of Witkin's team, like [10], [11] [12], [13]. It's just English; no reason to punish the reader by making the structure more ambiguous. Dicklyon (talk) 22:02, 2 August 2012 (UTC)

So, in "a scale space" it's a noun composed of two words, and in "a scale-space image" it's a compound adjective that is modifying the noun "image". Yes, I can see how it works. And I see that Witkin's paper does make that distinction. OK, no problem with me, the current title is fine. --Enric Naval (talk) 22:33, 2 August 2012 (UTC)

Enric, I don't mind helping you learn a bit about English grammar and punctuation and such, but it has been a real pain trying to do that via RM arguments over the last year or two. Maybe this is a good step, where you actually learn the logic, instead of pretending that you can just copy what you find in sources and end up in a good place. Dicklyon (talk) 00:37, 5 August 2012 (UTC)

Move to Scale-space theory?

as per lead. fgnievinski (talk) 05:18, 8 September 2020 (UTC)

I am against such a move. The term "scale space" is often used as the shorter term. Tpl (talk) 12:13, 14 February 2021 (UTC)