2.2 Markdown extensions by bookdown
Although Pandoc’s Markdown is much richer than the original Markdown syntax, it still lacks a number of things that we may need for academic writing. For example, it supports math equations, but you cannot number and reference equations in multi-page HTML or EPUB output. We have provided a few Markdown extensions in bookdown to fill the gaps.
2.2.1 Number and reference equations
To number and refer to equations, put them in equation environments and assign labels to them using the syntax (\#eq:label), e.g.,
\begin{equation}
f\left(k\right) = \binom{n}{k} p^k\left(1-p\right)^{n-k}
(\#eq:binom)
\end{equation}
It renders the equation below:
\[\begin{equation} f\left(k\right)=\binom{n}{k}p^k\left(1-p\right)^{n-k} \tag{2.1} \end{equation}\]
You may refer to it using \@ref(eq:binom), e.g., see Equation (2.1).
Equation labels must start with the prefix eq: in bookdown. All labels in bookdown must only contain alphanumeric characters, :, -, and/or /. Equation references work best for LaTeX/PDF output, and they are not well supported in Word output or e-books. For HTML output, bookdown can only number the equations with labels. Please make sure equations without labels are not numbered by either using the equation* environment or adding \nonumber or \notag to your equations. The same rules apply to other math environments, such as eqnarray, gather, align, and so on (e.g., you can use the align* environment).
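As a rough illustration of the character rule above, the following R sketch checks candidate labels before you use them. The helper name is_valid_label is hypothetical (it is not part of bookdown), and the sketch assumes that "alphanumeric" means ASCII letters and digits.

```r
# Hypothetical helper (not a bookdown function): TRUE when a label contains
# only ASCII letters, digits, ":", "-", and "/".
is_valid_label <- function(label) {
  grepl("^[A-Za-z0-9:/-]+$", label)
}

# The underscore and the space in the last two labels are not allowed.
is_valid_label(c("eq:binom", "eq:var-beta", "eq:my_label", "eq:a b"))
```

A check like this can be run over the labels in your source files to catch characters such as underscores or spaces early.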
We demonstrate a few more math equation environments below. Here is an unnumbered equation using the equation* environment:
\begin{equation*}
\frac{d}{dx}\left( \int_{a}^{x} f(u)\,du\right)=f(x)
\end{equation*}
\[\begin{equation*} \frac{d}{dx}\left( \int_{a}^{x} f(u)\,du\right)=f(x) \end{equation*}\]
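As mentioned earlier, an alternative to the starred environment is to keep the regular equation environment and suppress its number. The sketch below does this with \notag, assuming the amsmath package is available for LaTeX output; \nonumber could be used in the same position.

```latex
\begin{equation}
\frac{d}{dx}\left( \int_{a}^{x} f(u)\,du\right)=f(x) \notag
\end{equation}
```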
Below is an align environment (2.2):
\begin{align}
g(X_{n}) &= g(\theta)+g'({\tilde{\theta}})(X_{n}-\theta) \notag \\
\sqrt{n}[g(X_{n})-g(\theta)] &= g'\left({\tilde{\theta}}\right)
\sqrt{n}[X_{n}-\theta ] (\#eq:align)
\end{align}
\[\begin{align} g(X_{n}) &= g(\theta)+g'({\tilde{\theta}})(X_{n}-\theta) \notag \\ \sqrt{n}[g(X_{n})-g(\theta)] &= g'\left({\tilde{\theta}}\right) \sqrt{n}[X_{n}-\theta ] \tag{2.2} \end{align}\]
You can use the split environment inside equation so that all lines share the same number (2.3). By default, each line in the align environment will be assigned an equation number. We suppressed the number of the first line in the previous example using \notag. In this example, the whole split environment was assigned a single number.
\begin{equation}
\begin{split}
\mathrm{Var}(\hat{\beta}) & =\mathrm{Var}((X'X)^{-1}X'y)\\
& =(X'X)^{-1}X'\mathrm{Var}(y)((X'X)^{-1}X')'\\
& =(X'X)^{-1}X'\mathrm{Var}(y)X(X'X)^{-1}\\
& =(X'X)^{-1}X'\sigma^{2}IX(X'X)^{-1}\\
& =(X'X)^{-1}\sigma^{2}
\end{split}
(\#eq:var-beta)
\end{equation}
\[\begin{equation} \begin{split} \mathrm{Var}(\hat{\beta}) & =\mathrm{Var}((X'X)^{-1}X'y)\\ & =(X'X)^{-1}X'\mathrm{Var}(y)((X'X)^{-1}X')'\\ & =(X'X)^{-1}X'\mathrm{Var}(y)X(X'X)^{-1}\\ & =(X'X)^{-1}X'\sigma^{2}IX(X'X)^{-1}\\ & =(X'X)^{-1}\sigma^{2} \end{split} \tag{2.3} \end{equation}\]
2.2.2 Theorems and proofs
Theorems and proofs are commonly used in articles and books in mathematics. However, please do not be misled by the names: a “theorem” is just a numbered/labeled environment, and it does not have to be a mathematical theorem (e.g., it can be an example irrelevant to mathematics). Similarly, a “proof” is an unnumbered environment. In this section, we always use the general meanings of a “theorem” and “proof” unless explicitly stated.
In bookdown, the types of theorem environments supported are in Table 2.1. To write a theorem, you can use the syntax below:
::: {.theorem}
This is a `theorem` environment that can contain **any**
_Markdown_ syntax.
:::
This syntax is based on Pandoc’s fenced Div blocks and can already be used in any R Markdown document to write custom blocks. Bookdown only offers special handling for theorem and proof environments. Since this uses the syntax of Pandoc’s Markdown, you can write any valid Markdown text inside the block.
Environment | Printed Name | Label Prefix |
---|---|---|
theorem | Theorem | thm |
lemma | Lemma | lem |
corollary | Corollary | cor |
proposition | Proposition | prp |
conjecture | Conjecture | cnj |
definition | Definition | def |
example | Example | exm |
exercise | Exercise | exr |
hypothesis | Hypothesis | hyp |
To write other theorem environments, replace ::: {.theorem} with other environment names in Table 2.1, e.g., ::: {.lemma}.
A theorem can have a name attribute so its name will be printed. For example,
::: {.theorem name="Pythagorean theorem"}
For a right triangle, if $c$ denotes the length of the hypotenuse
and $a$ and $b$ denote the lengths of the other two sides, we have
$$a^2 + b^2 = c^2$$
:::
If you want to refer to a theorem, you should label it. The label can be provided as an ID to the block of the form #label. For example,
::: {.theorem #foo}
A labeled theorem here.
:::
After you label a theorem, you can refer to it using the syntax \@ref(prefix:label). See the column Label Prefix in Table 2.1 for the value of prefix for each environment. For example, we have a labeled and named theorem below, and \@ref(thm:pyth) gives us its theorem number 2.1:
::: {.theorem #pyth name="Pythagorean theorem"}
For a right triangle, if $c$ denotes the length of the hypotenuse
and $a$ and $b$ denote the lengths of the other two sides, we have
$$a^2 + b^2 = c^2$$
:::
Theorem 2.1 (Pythagorean theorem) For a right triangle, if \(c\) denotes the length of the hypotenuse and \(a\) and \(b\) denote the lengths of the other two sides, we have
\[a^2 + b^2 = c^2\]
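The same pattern applies to the other environments in Table 2.1; only the class name and the label prefix change. Here is a minimal sketch for a lemma, where the label tri-ineq and the surrounding sentence are made up for illustration:

```markdown
::: {.lemma #tri-ineq name="Triangle inequality"}
For any real numbers $x$ and $y$, we have $|x + y| \le |x| + |y|$.
:::

We will use Lemma \@ref(lem:tri-ineq) in the next proof.
```

Note that the reference uses the prefix lem from Table 2.1, not thm.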
The proof environments currently supported are proof, remark, and solution. The syntax is similar to theorem environments, and proof environments can also be named using the name attribute. The only difference is that since they are unnumbered, you cannot reference them, even if you provide an ID to a proof environment.
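For illustration, a proof environment written with the fenced Div syntax might look like the sketch below. The text inside the blocks is made up, and the second block shows the optional name attribute:

```markdown
::: {.proof}
The claim follows directly from the definition.
:::

::: {.solution name="Alternative approach"}
Apply Theorem \@ref(thm:pyth) to each of the two triangles.
:::
```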
We have tried to make all these theorem and proof environments work out of the box, whether your output is PDF or HTML. If you are a LaTeX or HTML expert, you may want to customize the style of these environments anyway (see Chapter 4). Customization in HTML is easy with CSS, and each environment is enclosed in <div></div> with the CSS class being the environment name, e.g., <div class="lemma"></div>. For LaTeX output, we have predefined the style to be definition for environments definition, example, exercise, and hypothesis, and remark for environments proof and remark. All other environments use the plain style. The style definition is done through the \theoremstyle{} command of the amsthm package. If you do not want the default theorem definitions to be automatically added by bookdown, you can set options(bookdown.theorem.preamble = FALSE). This can be useful, for example, to avoid conflicts in single documents (Section 3.4) using the output format bookdown::pdf_book with a base_format that has already included amsmath definitions.
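A minimal sketch of setting this option is shown below. It assumes the option is set in the same R session that renders the document, before rendering starts (for instance, near the top of the script that calls the render function); where exactly it belongs in your own build is an assumption you should verify.

```r
# Turn off bookdown's default theorem definitions for LaTeX output.
options(bookdown.theorem.preamble = FALSE)

# Confirm that the option is set in the current R session.
getOption("bookdown.theorem.preamble")
```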
Theorems are numbered by chapters by default. If there are no chapters in your document, they are numbered by sections instead. If the whole document is unnumbered (the output format option number_sections = FALSE), all theorems are numbered sequentially from 1, 2, …, N. LaTeX supports numbering one theorem environment after another, e.g., let theorems and lemmas share the same counter. This is not supported for HTML/EPUB output in bookdown. You can change the numbering scheme in the LaTeX preamble by defining your own theorem environments, e.g.,
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}[theorem]{Lemma}
When bookdown detects \newtheorem{theorem} in your LaTeX preamble, it will not write out its default theorem definitions, which means you have to define all theorem environments by yourself. For the sake of simplicity and consistency, we do not recommend that you do this. It can be confusing when your Theorem 18 in PDF becomes Theorem 2.4 in HTML.
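To see why the numbers can diverge, the toy R sketch below contrasts per-chapter numbering with a single document-wide counter. It only illustrates the two schemes and is not how bookdown computes numbers internally; the helper number_by_chapter and the input vector are hypothetical.

```r
# Hypothetical helper: number items within each chapter, as in "2.1", "2.2".
number_by_chapter <- function(chapter) {
  within_chapter <- ave(seq_along(chapter), chapter, FUN = seq_along)
  paste0(chapter, ".", within_chapter)
}

# Chapters in which five theorems appear, in document order.
theorem_chapters <- c(1, 1, 2, 2, 2)

number_by_chapter(theorem_chapters)  # "1.1" "1.2" "2.1" "2.2" "2.3"
seq_along(theorem_chapters)          # 1 2 3 4 5 (one counter for the whole document)
```

The fourth theorem is 2.2 under the first scheme and 4 under the second, which is the kind of mismatch described above.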
Below we show more examples of the theorem and proof environments, so you can see the default styles in bookdown.
Definition 2.1 The characteristic function of a random variable \(X\) is defined by
\[\varphi _{X}(t)=\operatorname {E} \left[e^{itX}\right], \; t\in\mathcal{R}\]
Example 2.1 We derive the characteristic function of \(X\sim U(0,1)\) with the probability density function \(f(x)=\mathbf{1}_{x \in [0,1]}\).
\[\begin{equation*} \begin{split} \varphi _{X}(t) &= \operatorname {E} \left[e^{itX}\right]\\ & =\int e^{itx}f(x)dx\\ & =\int_{0}^{1}e^{itx}dx\\ & =\int_{0}^{1}\left(\cos(tx)+i\sin(tx)\right)dx\\ & =\left.\left(\frac{\sin(tx)}{t}-i\frac{\cos(tx)}{t}\right)\right|_{0}^{1}\\ & =\frac{\sin(t)}{t}-i\left(\frac{\cos(t)-1}{t}\right)\\ & =\frac{i\sin(t)}{it}+\frac{\cos(t)-1}{it}\\ & =\frac{e^{it}-1}{it} \end{split} \end{equation*}\]
Note that we used the fact \(e^{ix}=\cos(x)+i\sin(x)\) twice.
Lemma 2.1 For any two random variables \(X_1\), \(X_2\), they both have the same probability distribution if and only if
\[\varphi _{X_1}(t)=\varphi _{X_2}(t)\]
Theorem 2.2 If \(X_1\), …, \(X_n\) are independent random variables, and \(a_1\), …, \(a_n\) are some constants, then the characteristic function of the linear combination \(S_n=\sum_{i=1}^na_iX_i\) is
\[\varphi _{S_{n}}(t)=\prod_{i=1}^n\varphi _{X_i}(a_{i}t)=\varphi _{X_{1}}(a_{1}t)\cdots \varphi _{X_{n}}(a_{n}t)\]
Proposition 2.1 The distribution of the sum of independent Poisson random variables \(X_i \sim \mathrm{Pois}(\lambda_i),\: i=1,2,\cdots,n\) is \(\mathrm{Pois}(\sum_{i=1}^n\lambda_i)\).
Proof. The characteristic function of \(X\sim\mathrm{Pois}(\lambda)\) is \(\varphi _{X}(t)=e^{\lambda (e^{it}-1)}\). Let \(P_n=\sum_{i=1}^nX_i\). We know from Theorem 2.2 that
\[\begin{equation*} \begin{split} \varphi _{P_{n}}(t) & =\prod_{i=1}^n\varphi _{X_i}(t) \\ & =\prod_{i=1}^n e^{\lambda_i (e^{it}-1)} \\ & = e^{\sum_{i=1}^n \lambda_i (e^{it}-1)} \end{split} \end{equation*}\]
This is the characteristic function of a Poisson random variable with the parameter \(\lambda=\sum_{i=1}^n \lambda_i\). From Lemma 2.1, we know the distribution of \(P_n\) is \(\mathrm{Pois}(\sum_{i=1}^n\lambda_i)\).
Remark. In some cases, it is very convenient and easy to figure out the distribution of the sum of independent random variables using characteristic functions.
Corollary 2.1 The characteristic function of the sum of two independent random variables \(X_1\) and \(X_2\) is the product of characteristic functions of \(X_1\) and \(X_2\), i.e.,
\[\varphi _{X_1+X_2}(t)=\varphi _{X_1}(t) \varphi _{X_2}(t)\]
Exercise 2.1 (Characteristic Function of the Sample Mean) Let \(\bar{X}=\sum_{i=1}^n \frac{1}{n} X_i\) be the sample mean of \(n\) independent and identically distributed random variables, each with characteristic function \(\varphi _{X}\). Compute the characteristic function of \(\bar{X}\).
Solution. Applying Theorem 2.2, we have
\[\varphi _{\bar{X}}(t)=\prod_{i=1}^n \varphi _{X_i}\left(\frac{t}{n}\right)=\left[\varphi _{X}\left(\frac{t}{n}\right)\right]^n.\]
Hypothesis 2.1 (Riemann hypothesis) The Riemann zeta function is defined as \[\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}\] for complex values of \(s\), and the series converges when the real part of \(s\) is greater than 1. The Riemann hypothesis is that the Riemann zeta function has its zeros only at the negative even integers and complex numbers with real part \(1/2\).
2.2.2.1 A note on the old syntax
For older versions of bookdown (before v0.21), a theorem environment could be written like this:
```{theorem pyth, name="Pythagorean theorem"}
For a right triangle, if $c$ denotes the length of the hypotenuse
and $a$ and $b$ denote the lengths of the other two sides, we have
$$a^2 + b^2 = c^2$$
```
This syntax still works, but we do not recommend it since the new syntax allows you to write richer content and has a cleaner implementation. However, note that the old syntax has to be used if you want the environment to work with output formats in addition to HTML and PDF, such as EPUB. The fenced Div syntax only works for HTML and PDF output at the moment, and we will try to improve it in the future.
This conversion between the two syntaxes is straightforward. The above theorem could be rewritten in this way:
::: {.theorem #pyth name="Pythagorean theorem"}
For a right triangle, if $c$ denotes the length of the hypotenuse
and $a$ and $b$ denote the lengths of the other two sides, we have
$$a^2 + b^2 = c^2$$
:::
You can use the helper function bookdown::fence_theorems() to convert a whole file or a piece of text. This is a one-time operation. We have tried to do the conversion from old to new syntax safely, but we might have missed some edge cases. To make sure you do not overwrite the input file by accident, you can write the converted source to a new file, e.g.,
bookdown::fence_theorems("01-intro.Rmd", output = "01-intro-new.Rmd")
Then double check the content of 01-intro-new.Rmd. Using output = NULL will print the result of conversion in the R console, and is another way to check the conversion. If you are using a version control tool, you can set output to be the same as input, as it should be safe and easy for you to revert the change if anything goes wrong.
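To preview the conversion of a piece of text rather than a file, something like the following sketch may work. It assumes that fence_theorems() accepts the lines through an argument named text, as we understand from its help page; please check ?bookdown::fence_theorems for your installed version. The theorem body is shortened for illustration.

````r
# Old-style theorem chunk, held as a character vector (one element per line).
old_syntax <- c(
  '```{theorem pyth, name="Pythagorean theorem"}',
  "For a right triangle, $a^2 + b^2 = c^2$.",
  "```"
)

# Preview the conversion without touching any file (requires bookdown).
if (requireNamespace("bookdown", quietly = TRUE)) {
  bookdown::fence_theorems(text = old_syntax, output = NULL)
}
````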
2.2.3 Special headers
There are a few special types of first-level headers that will be processed differently in bookdown. The first type is an unnumbered header that starts with the token (PART). Headers of this kind are translated to part titles. If you are familiar with LaTeX, this basically means \part{}. When your book has a large number of chapters, you may want to organize them into parts, e.g.,
# (PART) Part I {-}
# Chapter One
# Chapter Two
# (PART) Part II {-}
# Chapter Three
A part title should be written right before the first chapter title in this part, and both titles should be in the same document. You can use (PART\*) (the backslash before * is required) instead of (PART) if a part title should not be numbered.
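For example, a book with unnumbered parts could be sketched as follows (the part and chapter titles are made up):

```markdown
# (PART\*) Background {-}

# Chapter One

# (PART\*) Methods {-}

# Chapter Two
```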
The second type is an unnumbered header that starts with (APPENDIX), indicating that all chapters after this header are appendices, e.g.,
# Chapter One
# Chapter Two
# (APPENDIX) Appendix {-}
# Appendix A
# Appendix B
The numbering style of appendices will be automatically changed in LaTeX/PDF and HTML output (usually in the form A, A.1, A.2, B, B.1, …). This feature is not available for e-books or Word output.
2.2.4 Text references
You can assign some text to a label and reference the text using the label elsewhere in your document. This can be particularly useful for long figure/table captions (Sections 2.4 and 2.5), in which case you normally will have to write the whole character string in the chunk header (e.g., fig.cap = "A long long figure caption.") or your R code (e.g., kable(caption = "A long long table caption.")). It is also useful when these captions contain special HTML or LaTeX characters, e.g., if the figure caption contains an underscore, it works in the HTML output but may not work in LaTeX output because the underscore must be escaped in LaTeX.
The syntax for a text reference is (ref:label) text, where label is a unique label throughout the document for text. It must be in a separate paragraph with empty lines above and below it. The paragraph must not be wrapped into multiple lines, and should not end with a white space. For example,
(ref:foo) Define a text reference **here**.
Then you can use (ref:foo) in your figure/table captions. The text can contain anything that Markdown supports, as long as it is one single paragraph. Here is a complete example:
A normal paragraph.

(ref:foo) A scatterplot of the data `cars` using **base** R graphics.

```{r foo, fig.cap='(ref:foo)'}
plot(cars) # a scatterplot
```
Text references can be used anywhere in the document (not limited to figure captions). They can also be useful if you want to reuse a fragment of text in multiple places.
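As a sketch of such reuse, the same text reference can serve as both a figure caption and a table caption. The label cars-desc and the chunk labels below are made up for illustration:

````markdown
(ref:cars-desc) The `cars` data: speed and stopping distance.

```{r cars-plot, fig.cap='(ref:cars-desc)'}
plot(cars)
```

```{r cars-table}
knitr::kable(head(cars), caption = '(ref:cars-desc)')
```
````

Because the caption text lives in one place, editing the (ref:cars-desc) paragraph updates both captions.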