Comments about dispersion of light waves

Dispersion of light waves is well known, but the subject deserves some comments. Certain classical equations do not fully respect causality; as an example, group velocity $v_g$ is usually given as the first derivative of the angular frequency $\omega$ with respect to the angular spatial frequency $k_m$ (or wavenumber) in the medium, whereas it is $k_m$ that depends on $\omega$. This paper also emphasizes the use of phase index $n$ and group index $n_g$, as inverses of their respective velocities, normalized to $1/c$, the inverse of free-space light velocity. This clarifies the understanding of dispersion equations: group dispersion parameter $D$ is related to the first derivative of $n_g$ with respect to wavelength $\lambda$, whilst group velocity dispersion GVD is also related to the first derivative of $n_g$, but now with respect to angular frequency $\omega$. One notices that the term second-order dispersion does not have the same meaning with $\lambda$ as with $\omega$. In addition, two original and amusing geometrical constructions are proposed; they simply derive group index $n_g$ from phase index $n$ with a tangent, which helps to visualize their relationship. This applies to bulk materials, as well as to optical fibers and waveguides, and it can be extended to birefringence and polarization mode dispersion in polarization-maintaining fibers or birefringent waveguides.


Introduction
The theory of dispersion of light waves is well known and can be found in many textbooks, as for example [1][2][3], but the way it is usually presented deserves some comments. For example, causality can be seen as not being fully respected in the basic equations, within the meaning of causal link between their parameters. This paper also emphasizes that the indexes are very convenient to understand more easily the equations describing dispersion. They should be viewed as inverses of velocity, normalized to 1/c, the inverse of light velocity c in a vacuum, knowing that there is no short word for these inverses that are involved in these equations. Furthermore, indexes are dimensionless values, which avoids dealing with units that are not always easily understandable, as we shall see.
In addition, two simple and amusing geometrical constructions are presented to derive group index $n_g$ from refractive index $n$ with a tangent. To my knowledge, they are original. They apply to material dispersion, but also to guidance dispersion in a fiber or a waveguide, as well as to birefringence and intrinsic polarization mode dispersion (I-PMD) in polarization-maintaining (PM) fibers or birefringent waveguides.
Obviously, this does not prevent the use of Sellmeier's equation [4] and its derivatives with a computer, to calculate these indexes precisely, but it brings a complementary view that helps to visualize simply the question of dispersion, and the relationship between refractive index $n$ and group index $n_g$.

Comments regarding causality
The first analysis of dispersion was done by Newton, in the mid-17th century, with a prism that can separate the various colors of white light, because the refractive index $n$ (or index of refraction) of a glass is not constant; it is called chromatic dispersion (chroma is color, in ancient Greek). At the beginning of the 19th century, with the work of Fresnel on the mathematical theory of diffraction, it was finally accepted by the scientific community that light is a wave, as proposed by Huygens at the end of the 17th century, and that the index $n$ is the ratio between light velocity $c$ in a vacuum and its lower velocity in a medium. Remember that Newtonian corpuscular theory stated the opposite, with light going faster in a medium than in a vacuum, and that it required more than a century to get Huygens' wave model accepted, because of the tremendous prestige of Newton, as discussed by Aspect in a recent historical review paper [5]. In acoustics, it was proposed by Hamilton in the first half of the 19th century that a modulated wave has in fact two velocities: the sinusoidal carrier does propagate at the velocity of a continuous wave, called the phase velocity $v_\varphi$, but the modulation term propagates at the so-called group velocity $v_g$. This concept of group velocity was later developed in full mathematical detail by Rayleigh in his iconic book, The Theory of Sound [6], and group velocity was seen as the velocity of signal energy.
Journal of the European Optical Society-Rapid Publications
At the beginning of the 20th century, the application of this concept to optical waves raised many questions, because it could lead to a group velocity higher than $c$, in the case of high anomalous dispersion, i.e., when $dn/d\lambda \gg 0$, which was contradictory with the Theory of Relativity. This question was brilliantly solved by Sommerfeld and Brillouin in twin papers of 1914 [7,8]. An English version of these papers can be found in Brillouin's very interesting book of 1960, Wave Propagation and Group Velocity [9]. The important result is that anomalous dispersion happens when there is absorption, as in the far ultraviolet for example, but with absorption, group velocity is not the velocity of signal energy anymore; then, it is not contradictory with Relativity.
Going back to the usual case of normal optical dispersion in transparent dielectric media, i.e., when $dn/d\lambda < 0$, phase velocity $v_\varphi = c/n$ and group velocity $v_g$ are classically given by [1][2][3]:
$$v_\varphi = \frac{\omega}{k_m} \quad\text{and}\quad v_g = \frac{d\omega}{dk_m} \tag{1}$$
where $\omega$ is the angular (temporal) frequency, and $k_m$ is the angular spatial frequency in the medium. For $k_m = 2\pi/\lambda_m = 2\pi \cdot n/\lambda$, with $\lambda_m = \lambda/n$ being the wavelength in the medium and $\lambda$ being the wavelength in a vacuum, the terms angular wave number, or wave number alone, are also used, but I prefer angular spatial frequency to outline the duality between the temporal domain and the spatial domain. I do not much like the use of the derivative $d\omega/dk_m$ for the definition of $v_g$, since it does not respect the causal link between $k_m$ and $\omega$, which I call, in short, causality in this paper: it is $k_m$ that depends on $\omega$, and not $\omega$ that depends on $k_m$. You may say that $d\omega/dk_m$ and $(dk_m/d\omega)^{-1}$ are alike with Leibniz's notation of derivatives, beauty of math, but with Lagrange's notation, which I prefer here, it is less clear: $\omega'(k_m)$ suggests a causality that would not be fully respected, whereas $[k_m'(\omega)]^{-1}$ does respect causality. To be more specific, you must agree that it is the index $n$ (involved in $k_m = 2\pi \cdot n/\lambda$) that depends on the wavelength $\lambda$ in a vacuum (involved in $\omega$, with $\omega = 2\pi \cdot c/\lambda$), and not the opposite; it is $n(\lambda)$ and not $\lambda(n)$, even if $\lambda$, as a function of $n$, remains mathematically possible. Math does not care about causality and can invert a function, but causality is fundamental in physics! Equation (1) should be:
$$v_\varphi = \frac{\omega}{k_m(\omega)} \quad\text{and}\quad \frac{1}{v_g} = \frac{dk_m}{d\omega} \tag{2}$$
You may think that it is nit-picking but, to me, it is important, and I wanted to share this view. I am not the only one to think so, and equation (2) is found in several textbooks [10][11][12], even if they do not outline the difference between (1) and (2). It must be obvious for the authors of these references, but I am not sure that it is obvious for every reader of this article.
Now, if a frequency is the inverse of a period, as we have all learned with music, there is no short word for the inverse of a velocity, which is involved in (2). The habit is to use temporal delay over a unit distance [3] or group delay per unit length [10], with $t_g = 1/v_g$, but it is not very concise. Slowness could have been possible, but it is not very positive wording. This can be overcome with the use of refractive index $n$, also called phase index since $n = c/v_\varphi$, and of group index $n_g = c/v_g$. They should be viewed as the inverse of a velocity, normalized to $1/c$. Relativity physicists do normalize the time dependence $c \cdot t$ of their equations with $c = 1$, to get the same dimension for time as for the spatial coordinates; we can do the inverse! So, there are:
$$n = \frac{c}{v_\varphi} = c \cdot \frac{k_m}{\omega} \quad\text{and}\quad n_g = \frac{c}{v_g} = c \cdot \frac{dk_m}{d\omega} \tag{3}$$
You may think that it is nit-picking again, or obvious math, but I prefer to consider that it is useful to understand dispersion better. Using group index $n_g$, as defined in (3), it is simple to derive its relationship with phase index $n$. Since $k_m = n(\omega) \cdot \omega/c$, there is:
$$n_g = c \cdot \frac{dk_m}{d\omega} = \frac{d[n(\omega) \cdot \omega]}{d\omega} \tag{4}$$
It is important to notice in (4) the derivative of the product $[n(\omega) \cdot \omega]$. It explains the difference between dispersion seen as a function of wavelength $\lambda$, and seen as a function of angular frequency $\omega$, as we shall see later. Now from (4):
$$n_g = n + \omega \cdot \frac{dn}{d\omega} \tag{5}$$
With Lagrange's notation of derivative, instead of Leibniz's notation, this yields:
$$n_g(\omega) = n(\omega) + \omega \cdot n'(\omega) \tag{6}$$
This well-known equation is often written as a function of the wavelength $\lambda$ in a vacuum. Since $\omega = 2\pi \cdot c/\lambda$, there is, by logarithmic differentiation, $d\omega/\omega = -d\lambda/\lambda$, and then:
$$n_g = n - \lambda \cdot \frac{dn}{d\lambda} \tag{7}$$
or, with Lagrange's notation:
$$n_g(\lambda) = n(\lambda) - \lambda \cdot n'(\lambda) \tag{8}$$
It is often taught that there is no group velocity dispersion when the second derivative $n''(\lambda)$, or $d^2n/d\lambda^2$, is equal to 0; for silica fibers, this occurs at $\lambda = 1.3$ µm. Geometrically, this corresponds to an inflexion point of the refractive index curve $n$, as a function of wavelength $\lambda$ (Fig. 1). It is mathematically true but, again, it does not fully respect causality.
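As a numerical illustration of the relation $n_g = n - \lambda \cdot dn/d\lambda$, here is a minimal Python sketch. It assumes the three-term Sellmeier fit of fused silica published by Malitson in 1965; these coefficients are not given in this paper, and the function names are only illustrative. The derivative is estimated with a simple central difference.

```python
import math

# Sellmeier coefficients of fused silica (Malitson, 1965), wavelength in micrometers.
# Assumption: these values are not given in the paper; they are a common reference fit.
B_COEFFS = (0.6961663, 0.4079426, 0.8974794)
RESONANCES_UM = (0.0684043, 0.1162414, 9.896161)


def phase_index(lam_um):
    """Phase index n(lambda) from the three-term Sellmeier equation."""
    lam2 = lam_um ** 2
    terms = (b * lam2 / (lam2 - r ** 2) for b, r in zip(B_COEFFS, RESONANCES_UM))
    return math.sqrt(1.0 + sum(terms))


def phase_index_slope(lam_um, step_um=1e-4):
    """Central-difference estimate of dn/dlambda, in 1/micrometer."""
    return (phase_index(lam_um + step_um) - phase_index(lam_um - step_um)) / (2 * step_um)


def group_index(lam_um):
    """Group index n_g = n - lambda * dn/dlambda."""
    return phase_index(lam_um) - lam_um * phase_index_slope(lam_um)


lam = 1.55  # wavelength in a vacuum, in micrometers
n = phase_index(lam)
slope = phase_index_slope(lam)
n_g = group_index(lam)
print(f"n = {n:.4f}, dn/dlambda = {slope:.4f} per um, n_g = {n_g:.4f}")
```

With a negative slope (normal dispersion), the computed group index comes out higher than the phase index, as expected from the sign in the relation above.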
The cause is that there is no group dispersion when group velocity $v_g$, and therefore group index $n_g$, is constant, i.e., when the first derivative $dn_g/d\lambda$, or $n_g'(\lambda)$, is equal to zero. It happens to be the case when the second derivative of the refractive (or phase) index, $d^2n/d\lambda^2$, or $n''(\lambda)$, equals zero, but it is only a consequence of the fact that $dn_g/d\lambda = -\lambda \cdot d^2n/d\lambda^2$; it is not the original cause, which is that $dn_g/d\lambda$ equals zero.
This result influenced a vocabulary that must be handled with care. Chromatic dispersion, also known as phase velocity dispersion, i.e., first derivative of phase index $dn/d\lambda \neq 0$, is often called first-order dispersion, and group velocity dispersion, i.e., first derivative of group index $dn_g/d\lambda \neq 0$, but also second derivative of phase index $d^2n/d\lambda^2 \neq 0$, is often called second-order dispersion. So, I have another comment: if one considers frequencies (angular, $\omega$, or regular temporal, $f = \omega/2\pi$, or spatial, $\sigma = 1/\lambda$), this does not work anymore; this works only with period, i.e., wavelength $\lambda$ or temporal period $T = 1/f$. No group velocity dispersion means that $dn_g/d\omega$ (or $dn_g/df$, or $dn_g/d\sigma$) is equal to zero, since it is the basic cause, but from (5):
$$\frac{dn_g}{d\omega} = 2 \cdot \frac{dn}{d\omega} + \omega \cdot \frac{d^2n}{d\omega^2} \tag{9}$$
Therefore, when $dn_g/d\omega = 0$, the second derivative $d^2n/d\omega^2$ of the phase index equals $-(2 \cdot dn/d\omega)/\omega$, and it is not null. There are similar equations with $f$ and $\sigma$, since $df/f = d\sigma/\sigma = d\omega/\omega$, by logarithmic differentiation. Nevertheless, it is possible to view second-order dispersion with frequency. It is not as visual as for wavelength, with the inflexion point of the refractive index dependence, but it remains simple mathematically. We saw in (2) that $t_g = 1/v_g = dk_m/d\omega$, and group dispersion is the first derivative of this group delay $t_g$ over a unit distance, i.e., of $1/v_g$. With frequency, it is called group velocity dispersion (GVD), and:
$$\mathrm{GVD} = \frac{dt_g}{d\omega} = \frac{d^2k_m}{d\omega^2} \tag{10}$$
Since $k_m = 2\pi \cdot n/\lambda = [n(\omega) \cdot \omega]/c$, there is no second-order dispersion when the second derivative of the product $n(\omega) \cdot \omega$ is null, i.e., when $d^2k_m/d\omega^2 = (1/c) \cdot d^2[n(\omega) \cdot \omega]/d\omega^2 = 0$, whereas, with $\lambda$, it is when the second derivative of the refractive index $n(\lambda)$ alone is null, i.e., $d^2n/d\lambda^2 = 0$, as we already saw.
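The difference between the two meanings of second-order dispersion can be checked numerically. The following Python sketch is only illustrative: it assumes Malitson's Sellmeier coefficients for bulk fused silica (not given in this paper), finds the wavelength where $d^2n/d\lambda^2 = 0$ by bisection, and then verifies that, at this very point, $d^2n/d\omega^2$ is not null but equal to $-(2 \cdot dn/d\omega)/\omega$. Angular frequencies are expressed in rad/fs to keep the finite differences well conditioned.

```python
import math

C_UM_PER_FS = 0.299792458  # speed of light in a vacuum, in micrometers per femtosecond

# Assumption: Sellmeier fit of fused silica (Malitson, 1965), wavelength in micrometers.
B_COEFFS = (0.6961663, 0.4079426, 0.8974794)
RESONANCES_UM = (0.0684043, 0.1162414, 9.896161)


def n_of_lambda(lam_um):
    lam2 = lam_um ** 2
    return math.sqrt(1.0 + sum(b * lam2 / (lam2 - r ** 2)
                               for b, r in zip(B_COEFFS, RESONANCES_UM)))


def n_of_omega(omega):
    """Phase index as a function of angular frequency, in rad/fs."""
    return n_of_lambda(2 * math.pi * C_UM_PER_FS / omega)


def second_derivative(func, x, step=1e-3):
    return (func(x + step) - 2 * func(x) + func(x - step)) / step ** 2


def first_derivative(func, x, step=1e-3):
    return (func(x + step) - func(x - step)) / (2 * step)


def bisect(func, lo, hi, iterations=60):
    """Plain bisection; assumes func changes sign between lo and hi."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Wavelength where d2n/dlambda2 = 0, i.e., where dn_g/dlambda = 0.
lam0 = bisect(lambda lam: second_derivative(n_of_lambda, lam), 1.0, 1.6)
w0 = 2 * math.pi * C_UM_PER_FS / lam0

dn_dw = first_derivative(n_of_omega, w0)
d2n_dw2 = second_derivative(n_of_omega, w0)
print(f"zero group dispersion at {lam0:.3f} um")
print(f"d2n/dw2 = {d2n_dw2:.5f}, -2*(dn/dw)/w = {-2 * dn_dw / w0:.5f}")
```

Note that this is the zero of the bulk material; the value of 1.3 µm quoted for silica fibers also includes guidance effects.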
To promote, again, indexes as normalized inverse velocities, I have an additional comment about what is called group dispersion parameter, $D$, and what is called, as we just saw, group velocity dispersion, GVD, as you find on the refractiveindex.info web site, for example. $D$ is classically expressed in ps/(nm km), and it is simply [3]:
$$D = \frac{d(1/v_g)}{d\lambda} \tag{11}$$
Using the first derivative of $n_g(\lambda)$, this yields:
$$D = \frac{1}{c} \cdot \frac{dn_g}{d\lambda} \tag{12}$$
GVD is similar, but it uses the derivative with respect to $\omega$, instead of $\lambda$:
$$\mathrm{GVD} = \frac{d(1/v_g)}{d\omega} \tag{13}$$
Using the first derivative of $n_g(\omega)$, this yields:
$$\mathrm{GVD} = \frac{1}{c} \cdot \frac{dn_g}{d\omega} \tag{14}$$
Its unit is square second per meter. This is concise but not easily understandable. It would be clearer to use s/[(rad/s) m], or s$^2$/(rad m), a radian per second being the unit of $\omega$, even if a radian is a dimensionless unit that can be omitted. Deciding whether a dimensionless unit can be omitted or not is, by the way, a complicated question.
In silica, at 1550 nm, $dn/d\lambda \approx -0.012$ µm$^{-1}$ and $dn_g/d\lambda \approx +0.007$ µm$^{-1}$ (Fig. 2). I like these values that have a simple unit, and that you can check on index curves. Now, since $1/c$ is about one nanosecond per foot (a very useful value that I learned and still remember from my postdoc at Stanford University, in the early 1980s), group dispersion parameter $D$ is about +23 ps/(nm km), and group velocity dispersion GVD is about −28 ps$^2$/km (or fs$^2$/mm).
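These orders of magnitude can be reproduced with a short Python sketch of $D = (1/c) \cdot dn_g/d\lambda$ and $\mathrm{GVD} = (1/c) \cdot dn_g/d\omega$. It is only a sketch under stated assumptions: the index model is Malitson's Sellmeier fit of fused silica (not given in this paper), $dn_g/d\lambda$ is obtained from $-\lambda \cdot d^2n/d\lambda^2$ with a finite difference, and GVD is deduced from $D$ with $d\lambda/d\omega = -\lambda^2/(2\pi \cdot c)$.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in a vacuum

# Assumption: Sellmeier fit of fused silica (Malitson, 1965), wavelength in micrometers.
B_COEFFS = (0.6961663, 0.4079426, 0.8974794)
RESONANCES_UM = (0.0684043, 0.1162414, 9.896161)


def phase_index(lam_um):
    lam2 = lam_um ** 2
    return math.sqrt(1.0 + sum(b * lam2 / (lam2 - r ** 2)
                               for b, r in zip(B_COEFFS, RESONANCES_UM)))


def d2n_dlam2(lam_um, step_um=1e-3):
    """Central-difference estimate of the second derivative, in 1/um^2."""
    return (phase_index(lam_um + step_um) - 2 * phase_index(lam_um)
            + phase_index(lam_um - step_um)) / step_um ** 2


lam_um = 1.55
lam_m = lam_um * 1e-6

# dn_g/dlambda = -lambda * d2n/dlambda2, here in 1/um.
dng_dlam_per_um = -lam_um * d2n_dlam2(lam_um)

# D = (1/c) * dn_g/dlambda, first in s/m^2, then in ps/(nm km).
d_si = dng_dlam_per_um * 1e6 / C_M_PER_S
d_ps_nm_km = d_si * 1e6  # 1 s/m^2 = 1e6 ps/(nm km)

# GVD = (1/c) * dn_g/domega = D * dlambda/domega, with dlambda/domega = -lambda^2/(2 pi c).
gvd_si = d_si * (-lam_m ** 2 / (2 * math.pi * C_M_PER_S))
gvd_ps2_km = gvd_si * 1e27  # 1 s^2/m = 1e27 ps^2/km

print(f"dn_g/dlambda = {dng_dlam_per_um:+.4f} per um")
print(f"D = {d_ps_nm_km:+.1f} ps/(nm km), GVD = {gvd_ps2_km:+.1f} ps^2/km")
```

The two results have opposite signs, simply because $d\lambda/d\omega$ is negative.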
To better understand ps$^2$/km, the unit of GVD, and to compare it with ps/(nm km), the unit of $D$, one must see that in "ps$^2$", there is a first "ps" for the delay, as in ps/(nm km), and a second "ps" that is actually 1/($10^{12}$ rad/s), with the omission of the dimensionless radian. At 1550 nm, where the temporal frequency $f$ is about 193.4 THz, the angular frequency, $\omega = 2\pi \cdot f$, is about $1.2 \times 10^{15}$ rad/s; then, a value of $10^{12}$ rad/s for the shift $\Delta\omega$ is around 0.1% of $\omega$. The "nm" used in $D$ for $\Delta\lambda$ is also a shift of about 0.1% of the wavelength, which is in the µm range, and then the two numerical values, 23 and 28, are close. One femtosecond, for GVD, is about equivalent to one nanometer to the −1 power, for $D$.
In any case, ps$^2$/km remains a strange unit to me. As a teaser, it would also have been possible to use a concise and strange unit for $D$: ps/mm$^2$, since 1 nm × 1 km = 1 mm × 1 mm, but with mm$^2$, it looks like an area! Finally, remember that $D$ and GVD have opposite signs, since $\omega = 2\pi \cdot c/\lambda$, and then $d\omega/d\lambda$ is negative, as well as $d\lambda/d\omega$. Talking about positive or negative group dispersion requires specifying whether it is with respect to period or to frequency. To avoid this problem, it is customary to use for group velocity dispersion the same vocabulary as the one used for phase velocity dispersion. As we saw, normal dispersion corresponds to a negative derivative of the index with respect to $\lambda$, and anomalous dispersion corresponds to a positive derivative. However, I am not very fond of this vocabulary for group dispersion, even if it is convenient. Anomalous has an understandable meaning with phase velocity dispersion, since it is when the medium is absorbing, and it is not the normal use in the transparency window. With group velocity dispersion, a positive group index derivative is not anomalous, nor abnormal, strictly speaking. Taking the case of silica, it is simply above 1.3 µm, as seen in Figure 2!
[Figure caption, beginning lost] ...3 µm, there is no group velocity dispersion since $n_g$ is constant, i.e., $dn_g/d\lambda$ is null. This is the basic cause; the fact that $d^2n/d\lambda^2 = 0$ at 1.3 µm is just a consequence of $dn_g/d\lambda = -\lambda \cdot d^2n/d\lambda^2$.

Simple geometrical constructions to derive $n_g$ from $n$
Let us consider the theoretical case of a material without any group dispersion. As we saw, $dn_g/d\lambda = 0$ implies that $d^2n/d\lambda^2 = 0$. The curve of the refractive (or phase) index $n$ as a function of wavelength $\lambda$ is a simple affine function, with a constant slope, equal to $dn/d\lambda$, since $d^2n/d\lambda^2 = 0$:
$$n(\lambda) = n(0) + \lambda \cdot \frac{dn}{d\lambda} \tag{15}$$
Because the group index $n_g(\lambda)$ is equal to $n(\lambda) - (\lambda \cdot dn/d\lambda)$, as seen in (7), there is:
$$n_g(\lambda) = n(0) \tag{16}$$
Group index $n_g$ is constant and equal to the value $n(0)$ of the phase index for a null wavelength (Fig. 3).
As can be simply visualized with Figure 3, normal dispersion, i.e., a negative slope $dn/d\lambda$, yields a group index $n_g$ higher than the phase index $n$. Conversely, anomalous dispersion, i.e., a positive slope $dn/d\lambda$, yields a group index $n_g$ lower than the phase index $n$, and one can easily understand that with a steep positive slope, group index can become lower than one, and even negative, which was obviously a problem with the theory of Relativity, but this was solved by Sommerfeld and Brillouin, as we already saw [7][8][9]. Now, with the practical case of a material with group dispersion, one must consider the tangent to the curve $n(\lambda)$. Using Lagrange's notation, which is clearer here than Leibniz's notation, the equation $T_0(\lambda)$ of this tangent, for a given wavelength $\lambda_0$, is:
$$T_0(\lambda) = n(\lambda_0) + (\lambda - \lambda_0) \cdot n'(\lambda_0) \tag{17}$$
Then, it is simple to see with (7) again, that for a null wavelength:
$$T_0(0) = n(\lambda_0) - \lambda_0 \cdot n'(\lambda_0) = n_g(\lambda_0) \tag{18}$$
The tangent $T_0(\lambda)$ to the curve $n(\lambda)$, at $\lambda_0$, crosses the ordinate axis, i.e., when $\lambda$ is zero, at the value $n_g(\lambda_0)$ of the group index for $\lambda_0$ (Fig. 4). It is simple math, but it is amusing and, also, very useful to visualize simply the relationship between $n$ and $n_g$. Knowing this, Figure 1 can be revisited, even if silica is not transparent anymore below 0.15 µm (Fig. 5). Mathematically, it remains possible to continue the tangent toward zero. Math does not care about causality, as we already saw, nor does it care about transparency!

The case of a single-mode optical fiber
As we saw, a CW light wave propagates at a phase velocity $v_\varphi = \omega/k_m(\omega)$ in a bulk medium. In an optical fiber, there are discrete modes that propagate at different phase velocities, each with its own propagation constant $\beta$ [3,10]. These propagation constants depend on the angular temporal frequency $\omega$, and they have an intermediate value between the angular spatial frequency $k_{m2}$ in the cladding of refractive index $n_2$, and $k_{m1}$, the one in the core of refractive index $n_1$.
In the single-mode regime, the high-order modes are above cut-off, and the fundamental mode is the only one that can propagate. Its propagation constant $\beta(\omega)$ follows:
$$k_{m2}(\omega) < \beta(\omega) < k_{m1}(\omega) \tag{19}$$
It is very convenient to use the so-called effective index $n_{\mathrm{eff}}$ of the mode, defined with:
$$\beta(\omega) = n_{\mathrm{eff}}(\omega) \cdot \frac{\omega}{c} \tag{20}$$
Following (19), $n_{\mathrm{eff}}$ has an intermediate value between $n_2$ and $n_1$, and like $\beta$ it depends on frequency:
$$n_2 < n_{\mathrm{eff}}(\omega) < n_1 \tag{21}$$
The angular (temporal) frequency $\omega$ is very useful to shorten mathematical equations, but everybody is more familiar with the wavelength $\lambda$ in a vacuum. You know what 1 µm is, whereas 1 rad/s is not that obvious, besides the fact that it corresponds to 0.16 Hz. Therefore, the frequency dependence of $n_{\mathrm{eff}}$ is classically presented with respect to $\lambda_c/\lambda$, where $\lambda_c$ is the cut-off wavelength of the second-order mode. This ratio does correspond to a frequency: it is the spatial frequency $1/\lambda$ normalized to $1/\lambda_c$. This effective index $n_{\mathrm{eff}}$ is a phase index, and all the mathematical equations of Section 2 can be used to find the effective group index $n_{g\text{-eff}}$. It is also possible to use the geometrical construction of Section 3, which relates a group index to its corresponding phase index.
The way to proceed is to invert the classical curve $n_{\mathrm{eff}}(\lambda_c/\lambda)$, and to get the inverted curve $n_{\mathrm{eff}}(\lambda/\lambda_c)$, which depends on $\lambda$, and not on $1/\lambda$ anymore (Fig. 6).
It is interesting to notice that this inverted curve is easier to understand, for the fundamental mode, than the classical one: as the wavelength increases, the mode widens [2], and it expands more in the cladding, which decreases the effective index $n_{\mathrm{eff}}$. However, the classical representation remains better with several modes, since their effective index curves are spread about evenly in frequency, which would not be the case with the inverted representation.
Now, once you have this inverted representation in wavelength, you just have to apply the geometrical construction of Figure 4, to find the value of the effective group index $n_{g\text{-eff}}$ (Fig. 7).
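The tangent construction can be checked numerically on any phase index curve. The Python sketch below is only illustrative and uses a stand-in: since no analytical $n_{\mathrm{eff}}(\lambda)$ is given here, it takes the bulk index of fused silica (Malitson's Sellmeier fit, an assumption not found in this paper) as the phase index curve. It compares the value where the tangent crosses the ordinate axis with a group index computed independently in the frequency domain, as the derivative of the product $n(\omega) \cdot \omega$.

```python
import math

C_UM_PER_FS = 0.299792458  # speed of light in a vacuum, in micrometers per femtosecond

# Assumption: bulk fused silica (Malitson, 1965) stands in for the phase index curve;
# for a fiber, one would substitute the effective index curve of the mode.
B_COEFFS = (0.6961663, 0.4079426, 0.8974794)
RESONANCES_UM = (0.0684043, 0.1162414, 9.896161)


def phase_index(lam_um):
    lam2 = lam_um ** 2
    return math.sqrt(1.0 + sum(b * lam2 / (lam2 - r ** 2)
                               for b, r in zip(B_COEFFS, RESONANCES_UM)))


lam0 = 1.55   # wavelength of the tangent point, in micrometers
step = 1e-4

# Tangent to n(lambda) at lam0: T0(lambda) = n(lam0) + (lambda - lam0) * n'(lam0).
slope = (phase_index(lam0 + step) - phase_index(lam0 - step)) / (2 * step)


def tangent(lam_um):
    return phase_index(lam0) + (lam_um - lam0) * slope


intercept = tangent(0.0)  # where the tangent crosses the ordinate axis


# Independent check in the frequency domain: n_g = d[n(omega) * omega]/d(omega).
def n_times_omega(omega):
    return phase_index(2 * math.pi * C_UM_PER_FS / omega) * omega


w0 = 2 * math.pi * C_UM_PER_FS / lam0
n_g_from_omega = (n_times_omega(w0 + step) - n_times_omega(w0 - step)) / (2 * step)

print(f"tangent intercept T0(0) = {intercept:.5f}")
print(f"n_g from d(n*omega)/d(omega) = {n_g_from_omega:.5f}")
```

The two numbers agree within the accuracy of the finite differences, which is the whole point of the construction.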
Note, however, that it is also possible to find a geometrical construction with the frequency dependence. It is not as simple as with the period dependence, but it has some interest. The equation of the tangent $T_0(\omega)$ to the curve $n(\omega)$, for a given angular frequency $\omega_0$, is:
$$T_0(\omega) = n(\omega_0) + (\omega - \omega_0) \cdot n'(\omega_0) \tag{22}$$
with:
$$T_0(0) = n(\omega_0) - \omega_0 \cdot n'(\omega_0) \tag{23}$$
The slope of this tangent is $n'(\omega_0)$. Consider now the affine function $DS_0(\omega)$ starting from $T_0(0)$, and having a double slope, i.e., a slope equal to twice that of the tangent $T_0(\omega)$:
$$DS_0(\omega) = T_0(0) + 2 \cdot n'(\omega_0) \cdot \omega \tag{24}$$
Following (24), there is:
$$DS_0(\omega_0) = n(\omega_0) + \omega_0 \cdot n'(\omega_0)$$
As seen in (6), the group index follows $n_g(\omega) = n(\omega) + [\omega \cdot n'(\omega)]$, then:
$$DS_0(\omega_0) = n_g(\omega_0)$$
With $\lambda$, we saw that the tangent to the curve $n(\lambda)$, at $\lambda_0$, crosses the ordinate axis at $n_g(\lambda_0)$. With $\omega$, one draws a double-slope line $DS_0(\omega)$ (Fig. 8). In the case of the fundamental mode of a fiber, the double-slope construction can be used with the classical curve $n_{\mathrm{eff}}(\lambda_c/\lambda)$ seen in Figure 6a, as shown in Figure 9.
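The double-slope construction can be verified in the same spirit. In this minimal Python sketch, the phase index curve is again a stand-in (bulk fused silica with Malitson's Sellmeier coefficients, an assumption not found in this paper), and the names `tangent_at_zero` and `double_slope` are only illustrative. The value of the double-slope line at $\omega_0$ is compared with the group index obtained independently from the wavelength formula $n_g = n - \lambda \cdot dn/d\lambda$.

```python
import math

C_UM_PER_FS = 0.299792458  # speed of light in a vacuum, in micrometers per femtosecond

# Assumption: bulk fused silica (Malitson, 1965) stands in for the phase index curve.
B_COEFFS = (0.6961663, 0.4079426, 0.8974794)
RESONANCES_UM = (0.0684043, 0.1162414, 9.896161)


def n_of_lambda(lam_um):
    lam2 = lam_um ** 2
    return math.sqrt(1.0 + sum(b * lam2 / (lam2 - r ** 2)
                               for b, r in zip(B_COEFFS, RESONANCES_UM)))


def n_of_omega(omega):
    """Phase index as a function of angular frequency, in rad/fs."""
    return n_of_lambda(2 * math.pi * C_UM_PER_FS / omega)


lam0 = 1.55                               # micrometers
w0 = 2 * math.pi * C_UM_PER_FS / lam0     # rad/fs
step = 1e-4

# Slope n'(w0) of the tangent to n(omega), and its value T0(0) on the ordinate axis.
slope_w = (n_of_omega(w0 + step) - n_of_omega(w0 - step)) / (2 * step)
tangent_at_zero = n_of_omega(w0) - w0 * slope_w


def double_slope(omega):
    """Affine line starting from T0(0) with twice the slope of the tangent."""
    return tangent_at_zero + 2 * slope_w * omega


# Independent reference: group index from the wavelength formula.
slope_lam = (n_of_lambda(lam0 + step) - n_of_lambda(lam0 - step)) / (2 * step)
n_g_from_lambda = n_of_lambda(lam0) - lam0 * slope_lam

print(f"DS0(w0) = {double_slope(w0):.5f}, n_g from wavelength = {n_g_from_lambda:.5f}")
```

Since the construction only involves slopes and intercepts, it works unchanged with any variable proportional to $\omega$, such as the normalized spatial frequency $\lambda_c/\lambda$.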
Obviously, this geometrical construction can also be used for high-order modes, as well as for integrated-optic waveguides.
Fig. 4. Refractive (or phase) index $n(\lambda)$ (solid line curve), with group dispersion. The tangent $T_0(\lambda)$ to the curve $n(\lambda)$ at $\lambda_0$ crosses the ordinate axis corresponding to $\lambda = 0$, at the value $n_g(\lambda_0)$ of the group index for $\lambda_0$.
Fig. 7. Geometrical construction to derive the effective group index $n_{g\text{-eff}}(\lambda/\lambda_c)$ of the fundamental mode (solid line curve) from its effective phase index $n_{\mathrm{eff}}(\lambda/\lambda_c)$ (dashed line curve). The tangent to $n_{\mathrm{eff}}(\lambda/\lambda_c)$ crosses the ordinate axis at the value of the corresponding effective group index $n_{g\text{-eff}}(\lambda/\lambda_c)$, as shown for $\lambda = 1.25\,\lambda_c$. One sees easily that $n_{g\text{-eff}}$ is about equal to the refractive index of the core $n_1$, in the practical domain of use of a single-mode fiber, i.e., $\lambda_c < \lambda < 1.5\,\lambda_c$; above $1.5\,\lambda_c$, the mode starts to widen a lot and the curvature loss increases drastically.

The case of polarization mode dispersion in a PM fiber
The use of phase and group indexes as normalized inverses of velocity, as well as the geometrical constructions, that were presented, are also very useful for birefringence and intrinsic polarization mode dispersion (intrinsic PMD) [3] of high-birefringence polarization-maintaining (PM) fibers.
Phase birefringence $B$, or modal birefringence, or simply birefringence, is the difference between the phase effective index $n_{\mathrm{slow}}$ of the slow polarization mode, and $n_{\mathrm{fast}}$, that of the fast mode. In addition, it is very convenient to use the concept of group birefringence $B_g$, instead of intrinsic PMD; this simplifies equations. $B_g$ is the difference between the group indexes $n_{g\text{-slow}}$ and $n_{g\text{-fast}}$:
$$B = n_{\mathrm{slow}} - n_{\mathrm{fast}} \quad\text{and}\quad B_g = n_{g\text{-slow}} - n_{g\text{-fast}} \tag{28}$$
Fig. 8. Two possible geometrical constructions for relating group index $n_g$ to phase index $n$: (a) with the dependence in wavelength $\lambda$, i.e., the spatial period of the wave, the tangent $T_0(\lambda)$ to the phase index curve crosses the ordinate axis at the value $n_g(\lambda_0)$ of the group index; (b) with the dependence in angular frequency $\omega$, a double-slope line $DS_0(\omega)$ is drawn from where the tangent $T_0(\omega)$ to the phase index curve crosses the ordinate axis, and the group index $n_g(\omega_0)$ equals $DS_0(\omega_0)$.
Fig. 9. Geometrical construction relating the effective group index $n_{g\text{-eff}}$ to the effective index $n_{\mathrm{eff}}$ of the fundamental mode of a fiber, with the dependence in normalized spatial frequency $\lambda_c/\lambda$. A double-slope line is drawn from where the tangent to the effective index curve crosses the ordinate axis, and the group index $n_g(\lambda_c/\lambda_0)$ equals $DS_0(\lambda_c/\lambda_0)$, as shown for $\lambda_c/\lambda_0 = 0.8$, i.e., for $\lambda_0 = 1.25\,\lambda_c$. As in Figure 7, one easily sees that $n_{g\text{-eff}}$ is about equal to the refractive index of the core $n_1$, in the practical domain of use of a single-mode fiber, i.e., $\lambda_c < \lambda < 1.5\,\lambda_c$, or $0.67 < \lambda_c/\lambda < 1$.
The intrinsic PMD, $\mathrm{PMD}_i$, of high-birefringence PM fibers is the difference between the group delays per unit length of both modes [3]:
$$\mathrm{PMD}_i = \frac{1}{v_{g\text{-slow}}} - \frac{1}{v_{g\text{-fast}}} \tag{29}$$
which yields a very simple equation:
$$\mathrm{PMD}_i = \frac{B_g}{c} \tag{30}$$
knowing that $1/c$ is about one nanosecond per foot, as we already saw. Group birefringence is actually what is called intrinsic PMD, but normalized to $1/c$. Typical group birefringence $B_g$ of PM fibers is around $5 \times 10^{-4}$, and again it is a dimensionless value, which yields an intrinsic PMD on the order of 1.5-2 ps/m. Intrinsic PMD is related to phase birefringence dispersion: when $dB/d\lambda \neq 0$, group birefringence $B_g$ is different from phase birefringence $B$; it is a first-order dispersion. There is in addition group birefringence dispersion, when the first derivative $dB_g/d\lambda \neq 0$. As with group index, it is also when the second derivative $d^2B/d\lambda^2 \neq 0$, since $dB_g/d\lambda = -\lambda \cdot d^2B/d\lambda^2$. It is a second-order dispersion that must be considered in certain cases as, for example, with optical coherence-domain polarimetry (OCDP), also called distributed polarization crosstalk analysis (DPXA) [13,14].
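The order of magnitude quoted above follows from a one-line computation, intrinsic PMD being the group birefringence divided by $c$. The Python sketch below only does this arithmetic; the function name is illustrative.

```python
C_M_PER_S = 299_792_458.0  # speed of light in a vacuum


def intrinsic_pmd_ps_per_m(group_birefringence):
    """Intrinsic PMD = B_g / c, converted from s/m to ps/m."""
    return group_birefringence / C_M_PER_S * 1e12


pmd = intrinsic_pmd_ps_per_m(5e-4)  # typical group birefringence of a PM fiber
print(f"intrinsic PMD = {pmd:.2f} ps/m")
```

For $B_g = 5 \times 10^{-4}$, this gives about 1.7 ps/m, within the 1.5-2 ps/m range given in the text.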

Conclusion
This paper presents comments and geometrical constructions that should ease the understanding of dispersion, which is not always explained simply in textbooks. The points to remember are:
- Group velocity is classically given with $v_g = d\omega/dk_m$, i.e., $\omega'(k_m)$, but this equation does not fully respect causality, within the meaning of causal link between its parameters.
- It is better to use $1/v_g = dk_m/d\omega$, i.e., $k_m'(\omega)$. To be more specific, it is $n(\lambda)$ yielding $dn/d\lambda$, and not $\lambda(n)$ yielding $d\lambda/dn$.
- Phase and group indexes should be viewed as the inverse of their respective velocity, normalized to $1/c$. This yields clearer equations for group dispersion parameter, $D = (1/c) \cdot dn_g/d\lambda$, and for group velocity dispersion, $\mathrm{GVD} = (1/c) \cdot dn_g/d\omega$. In addition, indexes are dimensionless values, which avoids dealing with units that are not always easily understandable.
- The term second-order dispersion, for group velocity dispersion, should be used carefully. There is group velocity dispersion when the group index $n_g$ is not constant, which is the basic cause, causality again. It is the case when the second derivative of the phase index with respect to $\lambda$, $d^2n/d\lambda^2$, is not null, but it is only a consequence of $dn_g/d\lambda = -\lambda \cdot d^2n/d\lambda^2$. With $\omega$, it does not work anymore; it is when $d^2(n \cdot \omega)/d\omega^2$ is not null, and not $d^2n/d\omega^2$, that there is group dispersion.
- The unit of GVD, ps$^2$/km or fs$^2$/mm, remains strange to me, and I should not be the only one.
- The two simple geometrical constructions that relate group index $n_g$ to phase index $n$, with the tangent, are very useful to visualize simply their relationship, and they are new, to my knowledge. You must have noticed that I prefer clear geometrical figures using simple math to complicated equations.
- These comments and these geometrical constructions also apply to guidance dispersion in optical fibers and integrated-optic waveguides.
- These comments and these geometrical constructions can be used for birefringence and intrinsic polarization mode dispersion in PM fibers or birefringent integrated-optic waveguides, when the very convenient concept of group birefringence is used.
I hope that this paper will be useful and help to clarify the subject. It is just based on what I had not fully understood about dispersion, over the forty-five years of my career, as you can check in [15,16], where I classically used (1). I understand it better today, and I wanted to share it. And as you did notice, I do like the simple and amusing geometrical constructions with the tangent, which is the reason for this paper, even if I did not resist adding some comments that might look slightly provocative, but nevertheless important, I think.