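How Webpack Works: A Front-End Build Tool Dissected

I. What is webpack

webpack is a bundler. Its guiding idea is that every static asset can be bundled.

image-20210720144103400

II. Prototype analysis

We start by building a prototype of a bundled file.

Suppose there are two JS modules, and assume for now that both are ES5 modules that follow the CommonJS standard.

Our goal is to bundle these two modules into a single file that can run in the browser. This file is conventionally called bundle.js.

For example:
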
// index.js
var add = require('add.js').default
console.log(add(1 , 2))
// add.js
exports.default = function(a,b) {return a + b}

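Running this program directly in the browser will fail, mainly because the browser has no exports object and no require function.

We therefore need to emulate the exports object and the require function.

1. Emulating the exports object

When bundling under Node.js, we read a JS file with fs.readFileSync(), so the file's contents arrive as a string. There are two ways to run code held in a string: new Function and eval.
Here we choose eval. A direct eval call can see the variables of the enclosing function (such as exports), which the wrappers below rely on.
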
exports = {}
eval('exports.default = function(a,b) {return a + b}') // node⽂件读取后的代码字符串
exports.default(1,3)

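image-20210720144245345

The code above binds the module's function onto the exports object. Because a module may declare its own variables, we wrap the evaluation in an immediately invoked function so that nothing leaks into the global scope.
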
var exports = {}; // the semicolon matters: without it, {} followed by ( is parsed as a function call
(function (exports, code) {
  eval(code);
})(exports, 'exports.default = function(a,b){return a + b}');
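
For comparison, the other option mentioned above, new Function, can be sketched as follows. This is a minimal illustration rather than what webpack itself does; the variable names code and moduleExports are made up for the example.

```javascript
// The module source as a string, as it would arrive from fs.readFileSync
const code = "exports.default = function (a, b) { return a + b; };";

// new Function builds a function whose single parameter is named "exports"
// and whose body is the module source, so no wrapper function is needed.
const moduleExports = {};
new Function("exports", code)(moduleExports);

console.log(moduleExports.default(1, 3)); // 4
```

Unlike a direct eval, code run through new Function cannot see the local variables of the function that created it, so everything the module needs has to be passed in as a parameter.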

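2. Emulating the require function

The require function is fairly simple: it loads the module that corresponds to the given file name.

Let's first see how it looks when there is only one fixed module.
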
function require(file) {
var exports = {};
(function (exports, code) {
eval(code)
})(exports, 'exports.default = function(a,b){return a + b}')
return exports
}
var add = require('add.js').default
console.log(add(1 , 2))

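With the fixed-module version done, only a small change is needed: collect every module's file name and code string into a key-value table, and require can then load different modules according to the file name passed in.
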
(function (list) {
function require(file) {
var exports = {};
(function (exports, code) {
eval(code);
})(exports, list[file]);
return exports;
}
require("index.js");
})({
"index.js": `
var add = require('add.js').default
console.log(add(1 , 2))
`,
"add.js": `exports.default = function(a,b){return a + b}`,
});

Note that a real bundle.js generated by webpack also has to record the dependencies between modules.

This is called the dependency graph.

It looks something like this:

{
"./src/index.js": {
"deps": { "./add.js": "./src/add.js" },
"code": "....."
},
"./src/add.js": {
"deps": {},
"code": "......"
}
}

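In addition, most front-end code is written in ES6 syntax, so it has to be converted to ES5 first.

To summarize, webpack bundling can be split into three steps:

  1. Analyze dependencies
  2. Convert ES6 to ES5
  3. Replace exports and require

III. Implementation

Our goal is to bundle the following two ES6 modules (index.js depends on add.js) into a single JS file (bundle.js) that can run in the browser. This involves:

  • Handling modules
  • Merging multiple modules into one bundle to reduce network requests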

/src/add.js

export default (a, b) => a + b

/src/index.js

import add from "./add.js";
console.log(add(1 , 2))

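1. Analyzing a module

Analyzing a module takes three steps:

  • Read the file
  • Collect dependencies
  • Compile and parse to an AST

Analyzing a module means parsing the code string read from the file. This is essentially the same as the front end of a compiler for a high-level language: the module has to be parsed into an abstract syntax tree (AST). We use @babel/parser for this.

An abstract syntax tree (AST), or simply syntax tree, is an abstract representation of the syntactic structure of source code. It represents that structure as a tree in which every node stands for a construct in the source. (https://astexplorer.net/)

yarn add @babel/parser
yarn add @babel/traverse
yarn add @babel/core
yarn add @babel/preset-env
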
const fs = require("fs");
const path = require("path");
const parser = require("@babel/parser");
const traverse = require("@babel/traverse").default;
const babel = require("@babel/core");

function getModuleInfo(file) {
  // Read the file
  const body = fs.readFileSync(file, "utf-8");

  // Parse the source into an AST
  const ast = parser.parse(body, {
    sourceType: "module", // we are parsing an ES module
  });

  // Collect dependencies: map each import specifier to a path relative to the project root
  const deps = {};
  traverse(ast, {
    ImportDeclaration({ node }) {
      const dirname = path.dirname(file);
      const abspath = "./" + path.join(dirname, node.source.value);
      deps[node.source.value] = abspath;
    },
  });

  // Convert ES6 to ES5 (synchronous API; calling transformFromAst without a callback is deprecated)
  const { code } = babel.transformFromAstSync(ast, null, {
    presets: ["@babel/preset-env"],
  });

  const moduleInfo = { file, deps, code };
  return moduleInfo;
}

const info = getModuleInfo("./src/index.js");
console.log("info:", info);

The output looks like this:

image-20210720144822288

2. Collecting dependencies

The function from the previous step parses a single module. In this step we write a function that starts at the entry module and parses recursively along the dependencies, finally assembling them into the dependency graph.

/**
 * Parse every module reachable from the entry file
 * @param {*} file
 * @returns
 */
function parseModules(file) {
  const entry = getModuleInfo(file);
  const temp = [entry];
  const depsGraph = {};
  getDeps(temp, entry);
  temp.forEach((moduleInfo) => {
    depsGraph[moduleInfo.file] = {
      deps: moduleInfo.deps,
      code: moduleInfo.code,
    };
  });
  return depsGraph;
}

/**
 * Recursively collect dependencies
 * @param {*} temp
 * @param {*} param1
 */
function getDeps(temp, { deps }) {
  Object.keys(deps).forEach((key) => {
    const child = getModuleInfo(deps[key]);
    temp.push(child);
    getDeps(temp, child);
  });
}
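
The recursive collection above parses a module again every time it is imported, and it never terminates if two modules import each other. One possible refinement, sketched here under assumptions, is to skip modules that are already in the graph. The names collectGraph, fakeModules, and fakeGetModuleInfo are hypothetical; fakeGetModuleInfo stands in for getModuleInfo so the sketch runs without any files on disk.

```javascript
// Hypothetical in-memory stand-in for getModuleInfo (no file reading or Babel involved).
const fakeModules = {
  "./src/index.js": { deps: { "./add.js": "./src/add.js" }, code: "index" },
  // add.js imports index.js back on purpose, to create a cycle
  "./src/add.js": { deps: { "./index.js": "./src/index.js" }, code: "add" },
};

function fakeGetModuleInfo(file) {
  return { file, ...fakeModules[file] };
}

// Walk the dependencies breadth-first, visiting each module at most once.
function collectGraph(entry, getInfo) {
  const graph = {};
  const queue = [entry];
  while (queue.length > 0) {
    const file = queue.shift();
    if (graph[file]) continue; // already parsed: skip duplicates and cycles
    const { deps, code } = getInfo(file);
    graph[file] = { deps, code };
    queue.push(...Object.values(deps));
  }
  return graph;
}

const graph = collectGraph("./src/index.js", fakeGetModuleInfo);
console.log(Object.keys(graph)); // [ './src/index.js', './src/add.js' ]
```

The resulting object has the same shape as the dependency graph shown earlier, so it could be passed to the bundle step unchanged.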

3. Generating the bundle file

In this step we combine the runtime function written earlier with the dependency graph and output the final bundle.

function bundle(file) {
  const depsGraph = JSON.stringify(parseModules(file));
  return `(function (graph) {
  function require(file) {
    // Resolve a relative import through the current module's deps table
    function absRequire(relPath) {
      return require(graph[file].deps[relPath])
    }
    var exports = {};
    (function (require,exports,code) {
      eval(code)
    })(absRequire,exports,graph[file].code)
    return exports
  }
  require('${file}')
})(${depsGraph})`;
}

// Build the bundle from the entry module and write it to ./dist/bundle.js
const content = bundle("./src/index.js");
!fs.existsSync("./dist") && fs.mkdirSync("./dist");
fs.writeFileSync("./dist/bundle.js", content);

Finally, you can write a simple test program to check the result.
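
One way to check the runtime without a browser is to feed it a hand-written graph. The sketch below is an assumption-laden illustration: the code strings are simplified stand-ins for what Babel would emit, and the run helper and the result export are invented for the test.

```javascript
// Hand-written graph in the shape produced by parseModules.
// The code strings are simplified stand-ins, not real Babel output.
const graph = {
  "./src/index.js": {
    deps: { "./add.js": "./src/add.js" },
    code: 'var add = require("./add.js").default; exports.result = add(1, 2);',
  },
  "./src/add.js": {
    deps: {},
    code: "exports.default = function (a, b) { return a + b; };",
  },
};

// Same runtime as in the generated bundle, but returning the entry's exports.
function run(graph, entry) {
  function require(file) {
    function absRequire(relPath) {
      return require(graph[file].deps[relPath]);
    }
    var exports = {};
    (function (require, exports, code) {
      eval(code);
    })(absRequire, exports, graph[file].code);
    return exports;
  }
  return require(entry);
}

const entryExports = run(graph, "./src/index.js");
console.log(entryExports.result); // 3
```

For the real bundle, loading dist/bundle.js from a script tag in an HTML page and watching the console serves the same purpose.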

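If you are interested, think about how to go further and load CSS files, base64 images, or Vue single-file components (.vue).

ESM vs Webpack https://juejin.cn/post/6947890290896142350
Hand-writing webpack's core principles https://juejin.cn/post/6854573217336541192
Still don't get webpack after reading this? https://juejin.cn/post/6844904030649614349
Webpack series: how bundling works https://blog.csdn.net/weixin_41319237/article/details/116194091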

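A DingTalk Interviewer Shares: What Gets Asked in Front-End Interviews?

Original: https://mp.weixin.qq.com/s/1c-mwTQllNAxMOqWzvKADA

The first-round interview at DingTalk's dedicated edition (专有钉钉) mainly examines three areas:

  • Fundamentals: whether the candidate has a solid grounding in front-end fundamentals

  • Business thinking: the depth of the candidate's business understanding

  • Written test: the candidate's ability to apply basic knowledge in practice, plus coding style and logical thinking

Note that some questions may seem abrupt. Don't be surprised: they come up step by step as the conversation deepens, and some of the more obscure ones are purely bonus questions.

Fundamentals

Fundamentals cover the following areas:

  • Basics: basic understanding of computer organization, compiler principles, data structures, algorithms, design patterns, programming paradigms, etc.

  • Syntax: understanding and use of JavaScript, ECMAScript, CSS, TypeScript, HTML, Node.js, etc.

  • Frameworks: understanding and use of the principles behind React, Vue, Egg, Koa, Express, Webpack, etc.

  • Engineering: understanding and use of build tools, formatting tools, Git, NPM, unit testing, Nginx, PM2, CI / CD

  • Network: understanding of HTTP, TCP, UDP, WebSocket, Cookie, Session, cross-origin requests, caching, and protocols

  • Performance: understanding of build performance, monitoring, white-screen detection, SEO, Service Worker, etc.

  • Plugins: understanding of the design ideas behind Chrome, Vue CLI, Webpack, and other plugins

  • System: hands-on experience configuring Mac, Windows, and Linux systems

  • Back end: understanding and use of Redis caching, databases, GraphQL, SSR, template engines, etc.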

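Basics

  • List the types of computer storage devices you know of.

  • Which device is code generally stored on? How does code run on the CPU?

  • What are instructions and instruction sets?

  • What is the difference between complex (CISC) and reduced (RISC) instruction sets?

  • How does JavaScript run? What is the difference between interpreted and compiled languages?

  • Briefly describe Babel's compilation process.

  • How are JavaScript arrays and functions stored in memory?

  • How do the event loops in the browser and in Node.js differ?

  • What advantages do ES6 Modules have over CommonJS?

  • How is a high-level programming language compiled into machine language?

  • What phases does a compiler generally consist of? In which phase is type checking usually done?

  • What role does a virtual machine play in compilation?

  • What is intermediate representation (IR), and what is it for?

  • What is cross-compilation?

  • What is the difference between the publish / subscribe pattern and the observer pattern?

  • In what situations is the decorator pattern typically used?

  • Talk about your understanding of decoupled code design in large projects. What is IoC? Which design pattern is DI usually implemented with?

  • List the programming paradigms you know.

  • What is aspect-oriented programming (AOP)?

  • What is functional programming? What is reactive programming? What is functional reactive programming?

  • What are the use cases for reactive or functional reactive programming?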

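Syntax

  • How do you implement a three-row layout (top, middle, bottom) where the top and bottom have a minimum height of 100px and the middle adapts?

  • How do you detect that an element's content overflows its CSS box, so that you can selectively add a title or Tooltip?

  • How do you make a CSS element overflow on the left (ellipsis on the left side)?

  • What is a sandbox? What does the browser sandbox do?

  • How do you deal with password autofill in browser form fields?

  • What are the differences, pros, and cons of hash and history routing?

  • Can you push to a const array in JavaScript? Why?

  • What property descriptors do JavaScript objects have, and what does each do?

  • What APIs does console offer in JavaScript?

  • Briefly compare the pros and cons of the asynchronous APIs Callback, Promise, Generator, and Async.

  • What parameters does Object.defineProperty take, and what does each do?

  • What is the difference between Object.defineProperty and ES6 Proxy?

  • What are the use cases for Symbol, Map, and Decorator in ES6? In which libraries' source code have you seen these APIs used?

  • Why use TypeScript? What advantages does TypeScript have over JavaScript?

  • In TypeScript, what is the difference between const and readonly? Between enums and const enums? Between interfaces and type aliases?

  • What is the any type for in TypeScript?

  • What are the differences among any, never, unknown, and void in TypeScript?

  • Can a TypeScript interface declare a Function / Array / Class (Indexable)?

  • Can String, Number, Boolean, Symbol, Object, and so on be used as type annotations in TypeScript?

  • How does this in TypeScript differ from this in JavaScript?

  • What should you watch out for when using unions in TypeScript?

  • How do you design the declaration of a class in TypeScript?

  • How do you build a union of an enum's keys in TypeScript?

  • What do the symbols ?., ??, !., _, and ** mean in TypeScript?

  • What predefined conditional types does TypeScript provide?

  • Briefly describe TypeScript's module loading mechanism.

  • Briefly discuss your understanding of type compatibility in TypeScript, including invariance, bivariance, covariance, and contravariance.

  • Does object spread have any side effects in TypeScript?

  • Are interface, type, and enum declarations scoped in TypeScript?

  • Can same-named interfaces, or a same-named interface and class, be merged in TypeScript?

  • How do you make a TypeScript project import and recognize an npm package that was compiled to JavaScript?

  • What configuration options are there in TypeScript's tsconfig.json?

  • How do you set up path aliases for module imports in TypeScript?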

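Frameworks

  • What lifecycle methods do React class components have, and what does each do?

  • Can a request be issued in componentWillMount in a React class component? Why?

  • What are the differences between React class components and React Hooks?

  • What are the pros and cons of higher-order functions versus custom Hooks in React?

  • Briefly explain how useState and useEffect work in React Hooks.

  • How does React detect re-renders, what commonly causes them, and how do you avoid them?

  • What parameters does useEffect take in React Hooks, and how do you detect changes in the dependency array?

  • How does React's useEffect watch for changes in the dependency array?

  • How are React Hooks related to closures?

  • How does useState initialize its data in React?

  • List the React performance optimization techniques you commonly use.

  • How are directives in Vue 2.x templates parsed and implemented?

  • Briefly explain how Vue 2.x works end to end.

  • Briefly describe the framework design of Element UI.

  • What does it mean that Vue is a progressive framework?

  • What ways are there to communicate across components in Vue?

  • How does Vue's reactivity observe deeply nested properties of an object?

  • What are the differences among MVVM, MVC, and MVP? What is each suited to?

  • What is an MVVM framework?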

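Engineering

  • What features does Vue CLI 3.x have? Are you familiar with its plugin system?

  • How is the Webpack configuration assembled in Vue CLI 3.x?

  • How does Vue 2.x support TypeScript syntax?

  • How do you configure a JavaScript project so that it supports TypeScript syntax?

  • How do you lint TypeScript? What is the difference between ESLint and TSLint?

  • How does Node.js support TypeScript syntax?

  • How does TypeScript generate declaration files for a library automatically?

  • What limitations does Babel's TypeScript support have?

  • What is the difference between a Loader and a Plugin in Webpack?

  • How does Webpack support source-map positioning for syntax such as JSX?

  • How do you specify the entry point of a published npm package?

  • How do you publish a specific folder of your project as the root of an npm package?

  • How do you publish an npm package that supports tree shaking?

  • What is peerDependencies for in an npm package?

  • How do you debug an npm package elegantly before publishing it?

  • When designing a library, how do you generate a changelog?

  • Do you know Git submodules? Briefly describe what they are for.

  • How do you modify the message of a commit that has already been made in Git?

  • How do you undo a commit in Git while keeping the changes?

  • How do you make Git ignore a file that has already been committed?

  • How do you standardize Git commit messages?

  • Describe the structure of a commit message that follows the Angular convention.

  • How do you link a commit message to GitHub Issues?

  • What are Git hooks used for in a project?

  • What are client-side and server-side Git hooks each used for?

  • Which Git hooks are commonly used?

  • What is the difference between the pre-commit and commit-msg hooks? What can each be used for?

  • How do tools such as husky and ghooks create Git hooks?

  • How would you design a general-purpose Git hook?

  • Can Git hooks be written as Node scripts? How?

  • How do you make sure code pushed by others has no lint errors? How do you make sure the build has no lint errors?

  • How do you get lint warnings in VS Code? How do you format on save according to lint rules in VS Code?

  • What is the difference between ESLint and Prettier? Do problems arise when they work together?

  • How do you effectively identify formatting rules where ESLint and Prettier may conflict? How do you resolve such conflicts?

  • In a typical scaffolded project, how do you get ESLint to print validation errors in real time during hot module replacement?

  • Talk about your understanding of source maps.

  • How do you debug Node.js code? How do you debug Node.js TypeScript code? How do you debug Node.js code in the browser?

  • List all the build tools you know and their pros and cons. How should you choose among them in different scenarios?

  • What is the difference between user and workspace settings in VS Code?

  • Can a VS Code extension be enabled for the current project only?

  • What types of testing do you know?

  • What testing frameworks do you know?

  • What is e2e testing? What e2e testing frameworks are there?

  • Suppose you have an insertion sort implementation. How would you unit test it?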

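Network

  • How does a CDN speed up network delivery?

  • Does WebSocket use TCP or UDP?

  • What are simplex, half-duplex, and full-duplex communication?

  • Briefly describe what happens at the protocol level when an HTTP request is sent to a URL with a domain name (DNS, TCP, IP, link layer).

  • What is a forward proxy? What is a reverse proxy?

  • Can a cookie be generated on the server? What is the workflow after it has been generated there?

  • What are the differences and the relationship between Session and Cookie? How do you store sessions temporarily and persistently?

  • How do you protect against XSS attacks when setting cookies?

  • Briefly describe how automatic (login-free) sign-in is implemented. What security issues might arise? How are user login passwords usually encrypted?

  • What ways are there to speed up transfers in HTTP? What are the common content encodings?

  • If an image transfer is interrupted midway, how do you resume from the point of interruption afterwards?

  • What is a proxy? What is a gateway? What are they for?

  • Why is HTTPS more secure and reliable than HTTP?

  • What is symmetric-key (shared-key) encryption? What is asymmetric-key (public-key) encryption? Which is more secure?

  • What shortcomings do you think HTTP currently has?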

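Performance

  • In React, how do you tell whether the fields of a form are rendered at the smallest granularity / cost?

  • What ways of controlling rendering cost can you think of when developing with React?

Plugins

  • How is the Vue CLI 3.x plugin system designed?

  • How is Webpack's plugin mechanism designed?

System

  • What is the difference between \r\n (CRLF) and \n (LF)? (You can switch between them in the bottom-right corner of VS Code.)

  • What is /dev/null for?

  • How do you set an alias for a command in the Mac terminal?

  • How do you set environment variables on Windows?

  • Is the Mac file system case-sensitive for file paths by default?

  • How do you get the absolute path of a file when writing a shell script?

Back end

  • What are the differences and the relationship between Session and Cookie? How do you store sessions temporarily and persistently?

  • How do you deploy a Node.js application? How do you handle session consistency under load balancing?

  • How do you improve the runtime stability of Node.js code?

  • How does GraphQL differ from REST, and what are its advantages?

  • How does Vue SSR work? How is Vuex data rendered isomorphically?

  • What are the respective pros and cons of SSR and SPA?

  • How do you handle excessive load when Node.js renders HTML?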

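Business thinking

Business thinking is more about concrete business practice that draws on the breadth and depth of your fundamentals. It mainly covers:

  • Engineering: code deployment, CI / CD pipeline design, Jenkins, GitLab, Docker, etc.

  • Generality: framework design for scaffolding, SDKs, component libraries, etc.

  • Application frameworks: hybrid apps, micro-frontends, BFF, monorepo

  • Visualization:

  • Low-code: generic form design, generic layout design, generic page design, JSON Schema protocol design, etc.

  • Testing: E2E tests, unit tests, test coverage, test reports, etc.

  • Business: data, experience, complexity, monitoring

Engineering

  • Which CI / CD tools do you know? Have you worked with such a pipeline in a project?

  • If you had to build a CI / CD engineering platform for web front ends, how would you design it?

  • If we need to turn the online assets of an existing project (images, for example) into local, privately hosted resources, what solution would you propose?

  • How do you customize a scaffold with Vue CLI 3.x, for example one that comes with i18n, axios, Element UI, and route guards built in?

  • How does Jenkins work with Node.js scripts in a CI / CD design?

Generality

  • If you were to design a general-purpose project scaffold, how would you do it? What capabilities does such a scaffold usually need?

  • If you were to design a general-purpose utility library, how would you do it? What capabilities does such a library usually need?

  • Suppose your own React or Vue component library needs demo documentation. How would you design it? What features should the documentation provide?

  • When designing a utility library, how do you design the API documentation?

Application frameworks

  • Talk about your understanding of Electron, NW.js, CEF, Flutter, and native development.

  • Talk about your understanding of HotFix in desktop applications.

  • In what scenarios do you think a micro-frontend framework is needed?

Business

  • What is single sign-on? How do you implement it?

  • How do you design an internationalization solution for a project?

  • How do you design a monitoring and event-tracking solution for a project?

  • How do you build up a project's stability (monitoring, canary release, error fallback, rollback, ...)?

  • What performance optimizations should an admin-console application generally consider?

  • Describe some cases and technical solutions that improve a project's user experience (skeleton screens, loading handling, caching, error fallback, request retry, ...).

  • Suppose you need to design a watermark for pages. How would you design it?

Low-code

  • How do you design a generic JSON Schema protocol that can dynamically render a generic form with linked fields?

  • What capabilities does a typical low-code platform need?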

Written test

The written test mainly tests the candidate's logical thinking and coding style. It covers:

  • Regular expressions

  • Algorithms

  • Data structures

  • Design patterns

  • Implementing parts of how frameworks work

  • TypeScript syntax

  • Template parsing

Data structures

  • Use TypeScript to convert flat data without hierarchy into tree-structured data

    // Flat data
    [{
      name: 'Text 1',
      parent: null,
      id: 1,
    }, {
      name: 'Text 2',
      id: 2,
      parent: 1
    }, {
      name: 'Text 3',
      parent: 2,
      id: 3,
    }]

    // Tree data
    [{
      name: 'Text 1',
      id: 1,
      children: [{
        name: 'Text 2',
        id: 2,
        children: [{
          name: 'Text 3',
          id: 3
        }]
      }]
    }]
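
One possible approach to this exercise, sketched in plain JavaScript rather than TypeScript (type annotations could be layered on top): index every item by id first, then attach each node to its parent. The function name toTree is an assumption, and items whose parent id is unknown are simply skipped.

```javascript
// Convert a flat list with id/parent fields into a tree.
function toTree(items) {
  // First pass: create a node for every item, keyed by id.
  const nodes = new Map();
  for (const { id, name } of items) {
    nodes.set(id, { name, id });
  }

  // Second pass: attach each node to its parent, or collect it as a root.
  const roots = [];
  for (const { id, parent } of items) {
    const node = nodes.get(id);
    if (parent == null) {
      roots.push(node);
      continue;
    }
    const parentNode = nodes.get(parent);
    if (!parentNode) continue; // unknown parent id: skip (an assumption of this sketch)
    if (!parentNode.children) parentNode.children = [];
    parentNode.children.push(node);
  }
  return roots;
}

const flat = [
  { name: "Text 1", parent: null, id: 1 },
  { name: "Text 2", id: 2, parent: 1 },
  { name: "Text 3", parent: 2, id: 3 },
];
const tree = toTree(flat);
console.log(JSON.stringify(tree));
```

Two passes keep the conversion linear in the number of items and make it independent of the order in which parents and children appear.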

Template parsing

  • Implement a simple template engine

    const template = 'Hi {{ info.name.value }}, today is {{ day.value }}';

    const data = {
      info: {
        name: {
          value: 'Zhang San'
        }
      },
      day: {
        value: 'Wednesday'
      }
    };

    render(template, data); // Hi Zhang San, today is Wednesday
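
A minimal sketch of such a render function, assuming placeholders contain only dotted property paths and that unresolved placeholders should be left untouched (both are choices made for this example, not requirements stated in the exercise):

```javascript
// Replace every {{ path.to.value }} placeholder with the value found in data.
function render(template, data) {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match, path) => {
    // Walk the dotted path one key at a time, stopping at null/undefined.
    const value = path
      .split(".")
      .reduce((current, key) => (current == null ? undefined : current[key]), data);
    // Leave the placeholder as-is when the path does not resolve.
    return value == null ? match : String(value);
  });
}

const output = render("Hi {{ info.name.value }}, today is {{ day.value }}", {
  info: { name: { value: "Zhang San" } },
  day: { value: "Wednesday" },
});
console.log(output); // Hi Zhang San, today is Wednesday
```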

Design patterns

  • Implement a simple publish / subscribe pattern
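
A small sketch of what an answer might look like. The class name EventBus and the method names subscribe, unsubscribe, and publish are choices made for this example.

```javascript
class EventBus {
  constructor() {
    this.handlers = new Map(); // topic -> Set of subscriber callbacks
  }

  // Register a handler and return a function that removes it again.
  subscribe(topic, handler) {
    if (!this.handlers.has(topic)) this.handlers.set(topic, new Set());
    this.handlers.get(topic).add(handler);
    return () => this.unsubscribe(topic, handler);
  }

  unsubscribe(topic, handler) {
    const set = this.handlers.get(topic);
    if (set) set.delete(handler);
  }

  // Call every handler registered for the topic with the given arguments.
  publish(topic, ...args) {
    const set = this.handlers.get(topic);
    if (!set) return;
    // Copy first so handlers may unsubscribe while being notified.
    for (const handler of [...set]) handler(...args);
  }
}

const bus = new EventBus();
const received = [];
const unsubscribe = bus.subscribe("greet", (name) => received.push(name));
bus.publish("greet", "Ann");
unsubscribe();
bus.publish("greet", "Bob"); // nobody is listening any more
console.log(received); // [ 'Ann' ]
```

Publishers and subscribers only share the bus and a topic name, which is the main difference from the observer pattern, where the subject holds its observers directly.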

Regular expressions

  • Extract xxx from const a = require('xxx') in a string
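
One way to write such a pattern, as a sketch: it assumes the argument is a plain string literal in single or double quotes, and the name requirePattern is made up for the example.

```javascript
// Group 1 captures the opening quote, group 2 the module name;
// \1 requires the closing quote to match the opening one.
const requirePattern = /\brequire\(\s*(['"])(.+?)\1\s*\)/;

const match = "const a = require('xxx')".match(requirePattern);
const moduleName = match ? match[2] : null;
console.log(moduleName); // xxx
```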

References

Ziyi (子弈): https://juejin.cn/user/3227821870163176/posts

Campus recruiting guide for DingTalk's dedicated edition: https://juejin.cn/post/6933141572238671880

How I work as an interviewer at Alibaba: https://juejin.cn/post/6844904093425598471

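LaTeX: Inserting Math Formulas in Markdown

The syntax for inserting math formulas in Markdown is $formula$ for inline formulas and $$formula$$ for display formulas.

Inline formulas

$2x+3y=34$

When typesetting math, even an expression with no special symbols such as 1+1=2, or a single-letter variable such as x, should be written in math mode, as $1+1=2$ and $x$, rather than typeset as ordinary text.

Display formulas

A display formula sits on its own line and is not mixed with other text.
Example: $$c=2\pi r$$

Multi-line formulas

Use \\ inside a display formula to break the line.

$$
2x+3y=34\\
x+4y=25
$$

Common symbols

image-20220501033315739

Subscripts and superscripts

$S=a_{1}^2+a_{2}^2+a_{3}^2$
$$
S=a_{1}^2+a_{2}^2+a_{3}^2
$$

Brackets

$f(x, y) = 100 * \lbrace[(x + y) * 3] - 5\rbrace$
$$
f(x, y) = 100 * \lbrace[(x + y) * 3] - 5\rbrace \\
f(x, y) = 100 * \{[(x + y) * 3] - 5\}
$$

Fractions

$\frac{1}{3}$ and $\cfrac{1}{3}$
$$
\frac{1}{3} \ \text{and} \ \cfrac{1}{3}
$$

Roots

$\sqrt[3]{X}$ and $\sqrt{5 - x}$
$$
\sqrt[3]{X} \ \text{and} \ \sqrt{5 - x}
$$
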
Other characters

Relational operators

$$
\pm \ \times \ \div \ \mid \ \nmid \ \cdot \ \circ \ \ast \ \bigodot \ \bigotimes \ \bigoplus \ \leq \ \geq \ \neq \ \approx \ \equiv \ \sum \ \prod
$$

image-20220501033406229

Logarithm operators

$$
\log \ \lg \ \ln
$$

image-20220501033444782

Trigonometric operators

$$
\bot \ \angle \ \sin \ \cos \ \tan \ \cot \ \sec \ \csc
$$

image-20220501033513958

Calculus operators

$$
\prime \ \int \ \iint \ \iiint \ \oint \ \lim \ \infty \ \nabla \ \mathrm{d}
$$

image-20220501033539620

Set operators

$$
\emptyset \ \in \ \notin \ \subset \ \subseteq \ \supseteq \ \bigcap \ \bigcup \ \bigvee \ \bigwedge \ \biguplus \bigsqcup
$$

image-20220501033612481

Greek letters

$$
A \ \alpha \ B \ \beta \ \Gamma \ \gamma \\
\Delta \ \delta \ E \ \epsilon \ Z \ \zeta \\
H \ \eta \ \Theta \ \theta \ I \ \iota \\
K \ \kappa \ \Lambda \ \lambda \ M \ \mu \ N \ \nu \\
\Xi \ \xi \ O \ \omicron \ \Pi \ \pi \\
P \ \rho \ \Sigma \ \sigma \ T \ \tau \\
\Upsilon \ \upsilon \ \Phi \ \phi \\
X \ \chi \ \Psi \ \psi \ \Omega \ \omega
$$

image-20220501034204219

Deploy Your Next JS Application in Subfolder

You sometimes want to deploy your Next.js application under a subfolder such as /blog, /docs, or /dashboard. By default, though, a Next.js application can only be served from the root of the domain.

Since Next.js 9.5, you can change this by setting basePath in your next.config.js. By default, basePath is an empty string, but you can set it to /blog or /docs:

module.exports = {
  basePath: '/docs',
};

That also means you can run several Next.js applications under the same domain. You don't need to create a subdomain anymore.

After updating basePath, you won't be able to visit http://localhost:3000 in your local development environment. To keep browsing on your local machine, append the basePath value to http://localhost:3000 (for example, http://localhost:3000/docs).

You don't need to update the links in your Next.js code: links created with next/link and next/router get the basePath value prepended automatically.

Source: https://creativedesignsguru.com/deploy-your-next-js-application-in-subfolder/

Frontend Study Roadmap

#1 From https://roadmap.sh/

For React Developer

Everything that is there to learn about React and the ecosystem in 2021. https://roadmap.sh/react

The list below is exhaustive, and the items are listed in no particular order. You don’t need to learn everything listed in the picture, however knowing what you don’t know is as important as knowing things.

react-roadmap

For Front-end Developer

Step by step guide to becoming a modern front-end developer. https://roadmap.sh/frontend

#2 From https://frontendmasters.com/

The Front-End Developer Learning Roadmap

link: https://frontendmasters.com/guides/learning-roadmap/

roadmap-2

Front-End handbook: https://frontendmasters.com/books/front-end-handbook/2019

#3 From https://www.ladybug.dev/

Web Developer Learning Path

This is all subjective (different skills are more difficult for some people than for other) and there isn’t “one way” to learn web development! These are just some of the skills you might want to learn! https://www.ladybug.dev/episodes/web-developer-learning-path

Find the picture here: https://twitter.com/LadybugPodcast/status/1247051343212281856

EU5qV4-XYAA3DBR

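Drawing Diagrams with Graphviz and the DOT Language

I. Introduction to Graphviz

Graphviz is an open-source graph visualization tool available on Linux, macOS, Windows, and Solaris. It lets you draw diagrams from text, and the results can be exported as images, SVG, PDF, and other formats. Graphviz has to be installed before use; see the official documentation for the steps: https://graphviz.gitlab.io/download/. Tip: on macOS, install it with `brew install graphviz`; on Windows, download and run the exe installer, then add Graphviz to the system environment variables:

image-20210622162833910

Once it is installed and configured, run dot -version to check that the installation succeeded. On success it prints the version number and related information:

image-20210622163216158

II. Generating an image from the command line

Now create a text file named demo.dot on your computer and put the graph code in it:
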
graph simple
{
a -- b -- c;
b -- d;
}

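In a terminal, change to the directory that contains the .dot file and convert it to the output format you need:

dot demo.dot -Tpng -o demo.png

Congratulations, you have drawn a graph! Open demo.png and you will see the picture.

To output SVG instead:

dot demo.dot -Tsvg -o demo.svg

If you use VS Code, consider installing the Graphviz (dot) language support for Visual Studio Code and Graphviz Preview extensions.

III. Drawing with the DOT language

Demo1

A graph connects nodes with lines that have no arrowheads. You can also change the layout direction, node shapes, and text formatting, for example to draw a mind map with Graphviz:
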
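graph g {
  rankdir=LR // left-to-right layout
  "DOT language" -- {Introduction, Syntax, Examples}
  "DOT language" [shape=box, fontcolor=red]
  Introduction [color=red]
  Syntax [color=green]
  Examples [color=blue]
  Introduction -- {"Free and open source", "UML diagrams", "SVG export"}
  Syntax -- {"digraph", "graph"}
  "digraph" -- "Directed graph" [label="draws diagrams with direction"]
  "graph" -- "Undirected graph" [label="draws diagrams without direction"]
}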

image-20210622164656933

Demo2

Use "digraph" to draw lines with arrowheads:

digraph { a -> b }

image-20210622165058213

Demo3

Change node shapes and edge styles:

digraph D {

A [shape=diamond]
B [shape=box]
C [shape=circle]

A -> B [style=dashed, color=grey]
A -> C [color="black:invis:black"]
A -> D [penwidth=5, arrowhead=none]

}

image-20210622165243138

Demo4

It can be used to draw flowcharts, such as this state machine:

digraph finite_state_machine {
rankdir=LR;
size="8,5"

node [shape = doublecircle]; S;
node [shape = point ]; qi

node [shape = circle];
qi -> S;
S -> q1 [ label = "next" ];
S -> S [ label = "a" ];
q1 -> S [ label = "haha" ];
q1 -> q2 [ label = "next" ];
q2 -> q1 [ label = "b" ];
q2 -> q2 [ label = "b" ];
q2 -> q3 [ label = "next" ];
q3 -> q4 [ label = "next" ];
}

image-20210622165724800

Demo5-1

A simple binary tree:

digraph D {
A -> {B,C}
B -> {D,E}
C -> {F,G}
}

image-20210622170029325

Demo5-2

Don't forget that node shapes can be changed:

digraph D {
node [shape = box];
A -> {B, C, D} -> {F}
}

image-20210622171626854

More shapes:

image-20210622204633195

Demo6

Draw a table with HTML:

digraph html{
mytable[shape = none, margin = 0, label = <
<TABLE BORDER = "0" CELLBORDER = "1" CELLSPACING = "0" CELLPADDING = "4">
<TR><TD ROWSPAN = "3"><FONT COLOR = "red">hello</FONT><BR/>world</TD>
<TD COLSPAN = "3">b</TD>
<TD ROWSPAN = "3">g</TD>
<TD ROWSPAN = "3">h</TD>
</TR>
<TR><TD>c</TD>
<TD PORT = "here">d</TD>
<TD>e</TD>
</TR>
<TR><TD COLSPAN = "3">f</TD>
</TR>
</TABLE>
>]
}

image-20210622171048859

Demo7

Use rank to line nodes up in aligned rows (or columns):

digraph G{
rankdir = LR
{
node[shape = plaintext]
1995 -> 1996 -> 1997 -> 1998 -> 1999 -> 2000 -> 2001
}
{
node[shape = box, style = filled]
WAR3 -> Xhero -> Footman -> DOTA
WAR3 -> Battleship
}
{
{rank = same 1996 WAR3}
{rank = same 1998 Xhero Battleship}
{rank = same 1999 Footman}
{rank = same 2001 DOTA}
}
}

image-20210622171357857

Demo8

Use the compass points "n", "ne", "e", "se", "s", "sw", "w", and "nw" to specify where an edge starts and ends:

digraph G{
node[shape = box]
c:n -> d[label = n]
c1:ne -> d1[label = ne]
c2:e -> d2[label = e]
c3:se -> d3[label = se]
c4:s -> d4[label = s]
c5:sw -> d5[label = sw]
c6:w -> d6[label = w]
c7:nw -> d7[label = nw]
}

Demo9

Draw a flowchart:

digraph startgame {
label="Game resource update flow"
rankdir="TB"
start[label="Launch game" shape=circle style=filled]
ifwifi[label="On Wi-Fi?" shape=diamond]
needupdate[label="Resources need updating?" shape=diamond]
startslientdl[label="Silent download" shape=box]
enterhall[label="Enter game lobby" shape=box]

enterroom[label="Enter room" shape=box]
resourceuptodate[label="Resources incomplete?" shape=diamond]
startplay[label="Play normally" shape=circle fillcolor=blue]
warning[label="Ask player whether to update" shape=diamond]
startdl[label="Enter download screen" shape=box]
//{rank=same; needupdate, enterhall}

{shape=diamond; ifwifi, needupdate}

start -> ifwifi
ifwifi->needupdate[label="yes"]
ifwifi->enterhall[label="no"]
needupdate->startslientdl[label="yes"]
startslientdl->enterhall
needupdate->enterhall[label="no"]

enterhall -> enterroom
enterroom -> resourceuptodate
resourceuptodate -> warning[label="yes"]
resourceuptodate -> startplay[label="no"]
warning -> startdl[label="confirm download"]
warning -> enterhall[label="cancel download"]
startdl -> enterhall[label="cancel download"]
startdl -> startplay[label="download complete"]
}

image-20210622175352386

To go further with Graphviz, be sure to browse this repo: https://github.com/huangz1990/redisbook1e-gallery

Using Throttling and Debouncing with React hooks

https://dev.to/pulkitnagpal/using-throttling-and-debouncing-with-react-hooks-57f1

Throttling and debouncing techniques have been in use in JavaScript for many years.
In this post I'd like to share how we can use throttle and debounce functions with the help of React hooks.

Consider the example below, with two routes, / and /count, rendering their respective components.

export default function App() {
return (
<BrowserRouter>
<div>
<nav>
<ul>
<li>
<Link to="/">Home</Link>
</li>
<li>
<Link to="/count">Count</Link>
</li>
</ul>
</nav>
<Switch>
<Route path="/count">
<Count />
</Route>
<Route path="/">
<Home />
</Route>
</Switch>
</div>
</BrowserRouter>
);
}

Throttling Example with useEffect

Suppose we need to subscribe to the scroll event in the Count component when it mounts and simply increment the count on every scroll event.

Without throttling or debouncing, the code looks like this:

function Count() {
const [count, setCount] = useState(1);
useEffect(() => {
window.addEventListener('scroll', increaseCount);
return () => window.removeEventListener('scroll', increaseCount);
}, []);
const increaseCount = () => {
setCount(count => count + 1);
}
return <h2 style={{marginBottom: 1200}}>Count {count}</h2>;
}

Suppose that in a practical application you need to throttle and wait 100ms between executions of increaseCount. I have used the lodash throttle function for this example.

function Count() {
const [count, setCount] = useState(1);
useEffect(() => {
window.addEventListener('scroll', _.throttle(increaseCount, 100));
return () => window.removeEventListener('scroll', _.throttle(increaseCount, 100));
}, []);
const increaseCount = () => {
setCount(count => count + 1);
}
return <h2 style={{marginBottom: 1200}}>Count {count}</h2>;
}

Wait, no need to hurry. This works while you are on the /count route: increaseCount is throttled and increases the count at most once every 100ms.

But when you move to the / route, which renders the Home component and unmounts Count, and start scrolling on the home page, you will notice a warning in the console about a memory leak. This is because the scroll listener was not cleaned up properly.
The reason is that _.throttle(increaseCount, 100) is called again during unmount and returns a new function, which does not match the one created during mount.
What if we create a variable and store the throttled instance, like this?

const throttledCount = _.throttle(increaseCount, 100);
useEffect(() => {
window.addEventListener('scroll', throttledCount);
return () => window.removeEventListener('scroll', throttledCount);
}, []);

But this has a problem too. throttledCount is created on every render, which is unnecessary. The function should be created once, which is possible inside the useEffect hook, since the effect runs only once, on mount.

useEffect(() => {
const throttledCount = _.throttle(increaseCount, 100);
window.addEventListener('scroll', throttledCount);
return () => window.removeEventListener('scroll', throttledCount);
}, []);
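
The cleanup problem described above comes down to function identity, which can be shown without React or lodash. In the sketch below, throttle is a simplified stand-in written for illustration (leading edge only, not lodash's implementation), and the listeners Set is a hypothetical registry that, like addEventListener and removeEventListener, matches listeners by reference.

```javascript
// Simplified stand-in for _.throttle: run fn at most once per `wait` milliseconds.
function throttle(fn, wait) {
  let last = -Infinity;
  return function throttled(...args) {
    const now = Date.now();
    if (now - last >= wait) {
      last = now;
      fn.apply(this, args);
    }
  };
}

// Hypothetical listener registry that compares listeners by reference.
const listeners = new Set();
const increaseCount = () => {};

// Each throttle() call returns a new function object...
listeners.add(throttle(increaseCount, 100));
listeners.delete(throttle(increaseCount, 100)); // ...so this removes nothing
const leaked = listeners.size; // 1: the first wrapper is still registered

// Storing the wrapper once makes add and remove refer to the same function.
const throttledCount = throttle(increaseCount, 100);
listeners.add(throttledCount);
listeners.delete(throttledCount);
const remaining = listeners.size; // 1: only the leaked wrapper is left

console.log(leaked, remaining); // 1 1
```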

Debounce Example using useCallback or useRef

The example above is pretty simple. Let's look at another one, where there is an input field and you need to increment the count only after the user has stopped typing for a certain time. There is also a text value that is updated on every keystroke, which re-renders the component on every input.

Code with debounce:

function Count() {
const [count, setCount] = useState(1);
const [text, setText] = useState("");
const increaseCount = () => {
setCount(count => count + 1);
}
const debouncedCount = _.debounce(increaseCount, 1000);
const handleChange = (e) => {
setText(e.target.value);
debouncedCount();
}
return <>
<h2>Count {count}</h2>
<h3>Text {text}</h3>
<input type="text" onChange={handleChange}></input>
</>;
}

This will not work: the count increases on every keystroke. The reason is that a new debouncedCount is created on every render.
We have to store the debounced function so that it is created only once, as we did inside useEffect in the example above.
This is where useCallback comes in.
"useCallback will return a memoized version of the callback that only changes if one of the dependencies has changed" (React docs).
Replace

const debouncedCount = _.debounce(increaseCount, 1000);

with

const debouncedCount = useCallback(_.debounce(increaseCount, 1000),[]);

and it will work, because this time the debounced function that is kept is the one created on the first render.

Or we can use useRef, like this:

const debouncedCount = useRef(_.debounce(increaseCount, 1000)).current;

Keep in mind that every render of a React function component re-creates its local variables, unless you memoize them with hooks.

How does UmiJS stack up against Next.js?

comparing-react-ssr-frameworks-umi-vs-next

UmiJS is an extensible, enterprise-level React framework authored by Alipay’s developer team. Alipay uses it in its internal projects, as do several other companies such as Youku and Netease.

While exploring this framework, I discovered that it’s similar to Next.js in a handful of interesting ways. Both have support for routing and server-side rendering out of the box as well as TypeScript.

Along the way, I got curious about Umi and decided to look deeper into the framework to see how it compares with Next. I evaluated both frameworks based on the criteria listed below. Here are my findings.

CSS support

Next supports all the common CSS styling methods, including CSS-in-JS, Sass, Stylus, Less, CSS Modules, and PostCSS. For regular CSS, you can simply import the CSS file into your pages:

// styles.css
body {
font-family: 'SF Pro Text', 'SF Pro Icons', 'Helvetica Neue', 'Helvetica',
'Arial', sans-serif;
padding: 20px 20px 60px;
max-width: 680px;
margin: 0 auto;
}

// pages/_app.js
import '../styles.css'

// This default export is required in a new `pages/_app.js` file.
export default function MyApp({ Component, pageProps }) {
return <Component {...pageProps} />
}

Next has official plugins for writing CSS using Sass, Stylus, and Less. If you’re using CSS modules, you’ll need to follow Next’s naming convention, [name].module.css.

// Button.module.css
/*
You do not need to worry about .error {} colliding with any other `.css` or
`.module.css` files!
*/
.error {
color: white;
background-color: red;
}

// Button.js
import styles from './Button.module.css'

export function Button() {
return (
<button
type="button"
// Note how the "error" class is accessed as a property on the imported
// `styles` object.
className={styles.error}
>
Destroy
</button>
)
}

Umi, on the other hand, has dropped support for Sass and currently supports regular CSS, CSS module, and Less. If you want to use Sass or Stylus, you’ll need to configure the webpack config to do so. Umi automatically recognizes the use of CSS modules.

// Example of CSS Modules
import styles from './foo.css';

// Example of Non-CSS Modules
import './foo.css';

webpack customization

Next features such as code splitting, hot code reloading, and server-side rendering already work out of the box. But if you need extra power or just a different configuration, Next allows you to write your own configuration through its next.config.js module. The config file is a regular Node.js module instead of a JSON file.

module.exports = {
/* config options here */
}

Umi also has its own configuration file. Despite looking JSON-like, it is a module that default-exports a plain configuration object:

export default {
base: '/docs/',
publicPath: '/static/',
hash: true,
history: {
type: 'hash',
},
}

Documentation

I found Next’s documentation to be more detailed in explaining how to use each feature.

To show how each feature works, the docs walk you through building a simple blog app.

image-20210325142751370

Another thing to consider: part of Umi’s documentation is not yet translated into English (Umi’s main user base is located in China). I had to use Google Translate feature to help me read the documentation.

image-20210325142834404

CLI support

Umi has some interesting CLI support to generate pages and check the current webpack configuration.

Usage: umi <command> [options]

Commands:

build build application for production
config umi config cli
dev start a dev server for development
generate generate code snippets quickly
help show command helps
plugin inspect umi plugins
version show umi version
webpack inspect webpack configurations
dva
test test with jest

Run `umi help <command>` for more information of specific commands.
Visit https://umijs.org/ to learn more about Umi.

Next’s CLI support is focused solely on helping you to deploy the application.

Usage
$ next <command>

Available commands
build, start, export, dev, telemetry

Options
--version, -v Version number
--help, -h Displays this message

For more information run a command with the --help flag
$ next build --help

Plugin system

Umi’s internal functions are all third-party plugins. The documentation covers how its plugin system works, complete with a test framework.

Next has its own set of plugins, but I can’t seem to find instructions on how to create one and share it with other developers.

Why Next has the edge

Both Next and Umi fully support building React applications for production with little to no configuration. Next has more complete support for writing CSS and customizing its webpack configuration, while Umi is more opinionated and doesn’t give much support for webpack configurations.

For now, I prefer Next to Umi because I find the Umi documentation a bit hard to understand. I also found more guides for building things with Next, such as e-commerce websites and static sites.

Original link: How does UmiJS stack up against Next.js?

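Exploring Micro-Frontends

Reposted from Zhang Taifeng's blog series "A Tour of Micro-Frontends" (微前端大赏)

Part 1: https://www.cnblogs.com/ztfjs/p/single-spa.html

Part 2: https://www.cnblogs.com/ztfjs/p/single-spa2.html

Part 3: https://www.cnblogs.com/ztfjs/p/qiankun.html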

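A Tour of Micro-Frontends

What "micro" means

What is a micro-frontend, and what problem does it solve? To answer both questions we first need to settle what "micro" means. You have probably heard of microservices: an architectural pattern for backend services that targets availability, scalability, and coupling, and that gave rise to techniques such as service governance and service discovery. For example:

  • High availability through circuit breaking, rate limiting, and similar mechanisms
  • Load balancing for calls between microservices
  • Distributed transactions (2PC, 3PC, TCC, LCN, etc.)
  • Service call-chain tracing
  • Configuration center
  • Automatic service discovery

The basic capabilities of "micro":

Single responsibility

Each microservice should have a single responsibility; that is what makes it "micro". One microservice solves one business problem (note: one business problem, not one API endpoint).

Service orientation

A microservice wraps its own business capability and exposes it as a service. This is the core idea inherited from SOA, and a microservice may itself rely on other microservices.

Together these two capabilities shape the whole microservice architecture. It is organized around services and around individual responsibilities: services from one or more business domains are connected into a larger business module and then handled by divide and conquer, with each service receiving its own technical and business treatment, so that the whole forms a service-oriented business. When a service fails, the circuit breaker trips, the fallback service takes over at once, and the alerting service is called; it passes the details to the notification service, which reminds the responsible people to investigate and fix the problem.
This chain of operations is what the "micro" in microservices is meant to solve: a traditional large project is split into separate business modules, which consistency and availability components then tie back together.

"Micro" means divide and conquer. So what is a micro-frontend?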

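A brief history of the frontend

The backend JSP era

There is not much suspense about the JSP era. I still vaguely remember cloning the backend team's code, clumsily installing the Java JDK on a Windows machine by following CSDN posts, and searching Baidu for "What is the difference between the JRE and the JDK, and did I install the wrong one?" It was a struggle. In those days most frontend code had to pass through the backend developers' JSP tags before it could go live. Splitting and dividing the work happened mainly in JavaScript: the code was split into many separate script files that were included one after another through script tags.
At this stage the frontend was not really "micro"; it existed only as a scripting layer for the UI.

The iframe era

Gradually, Ajax, jQuery, and require.js broke the old frontend model. Ajax separated the frontend from the backend, jQuery made frontend code easier to write, and the AMD module specification together with require.js changed the frontend for good: it entered the age of modules.

require.js is a frontend module loader that follows the AMD specification.

In the early days, all JavaScript code lived in one file, and loading that one file was enough. As the code grew, one file was no longer enough; it had to be split into several files loaded one after another. Many people will have seen page code like this:
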
<script src="1.js"></script>
<script src="2.js"></script>
<script src="3.js"></script>

The RequireJS version:

Module code:

// main.js

require(['moduleA', 'moduleB', 'moduleC'], function (moduleA, moduleB, moduleC) {

  // some code here

});

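Through a call similar to the require() we know today, RequireJS can pull in a module. Combining that idea with iframes produced the first "micro-frontend" architecture, although at this point it was not very "micro".

A diagram of the Alibaba Cloud console illustrates this architecture:

avatar

The main application handles the framework, communication, routing, and resource allocation.
Child applications implement the business logic.
The two sides interact through a dedicated SDK.

This is already very close to the overall idea of microservices. The main framework solves the shared problems, while separate micro-modules and micro-apps each solve a single-responsibility problem. Each application is self-contained, that is, it is responsible only for itself. This brings several properties:

1. Technology-stack agnostic: apps only need to follow the same communication mechanism
2. Decoupled applications: teams interact through the main framework shell
3. Hot-pluggable updates: only the affected app is updated, not the whole main framework
4. Dynamic degradation and circuit breaking

...

You could say the frontend had already entered the micro-frontend era in this period. Of course, not every application suits such a heavyweight development model; Alibaba Cloud's dozens or hundreds of application modules are backed by a very large business.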

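Bundling and SSR (server-side rendering)

Then gulp and webpack appeared, along with single-page applications built on Angular, Vue, and React.

But this raised a problem: a single-page application carries heavy resources, and making the first page load fast takes considerable optimization effort. In this setting iframes cause fairly serious user-experience problems.

Enter single-spa

single-spa is a JavaScript solution for applying the microservice approach to the frontend.

It is likewise technology-stack agnostic: several frameworks (React, Vue, AngularJS, Angular, Ember, or any other) can be used on the same page without a page refresh.
Existing code does not need to be refactored either: new code is written with a new framework while the code in existing projects stays as it is.
Resources are better controlled: each independent module's code can be loaded on demand, so nothing extra is wasted.
Each independent module can also run on its own. It looks roughly like this:
avatar

Here are a few more diagrams borrowed from elsewhere (images are from the web and will be removed on request):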

image-20210225105459886

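Loader

The loader is the core module that loads child applications. Current micro-frontend designs generally use one of two modes.

The first is non-intrusive (iframe-style): the main app loads the child app's index.html, parses that entry HTML to find the child's JS and CSS files, and loads them.

In the second, the child app is bundled into a single JS file with an agreed export format. The main app loads only index.js and obtains the corresponding render and destroy methods from it.

External

One problem an SPA micro-frontend has to solve is dependencies shared between child apps. Once an application has been split into several child apps, how do we factor out their common dependencies so that they are reused? Anyone familiar with CommonJS knows that it caches modules: a module that has been loaded once is kept in a cache. So we can bundle child modules in CommonJS format, provide global exports and require methods when loading a child module, collect the exports of each child app, and have require load the external resources we configured.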

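Core problems

Communication

A message bus is, simply put, a hub for sending and receiving messages. Many applications connect to the bus and can send messages to it or receive messages from it (by subscribing and listening, or by actively pushing and pulling). For example, application A sends a message to the bus, the bus decides it should go to application B, and B receives it (B subscribed to, or pulled, A's message). The bus thus acts as an intermediary that decouples A from B, which is very convenient.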
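A minimal in-memory bus along these lines might look as follows. The `createBus` helper and the `user:login` topic are made-up names for illustration; a real setup would also need to decide how the bus instance is shared between the main app and its children.

```javascript
// Hypothetical publish/subscribe bus; names are illustrative only.
function createBus() {
  const handlers = new Map(); // topic -> Set of subscriber callbacks
  return {
    // Subscribe to a topic; returns a function that removes the subscription.
    on(topic, handler) {
      if (!handlers.has(topic)) handlers.set(topic, new Set());
      handlers.get(topic).add(handler);
      return () => handlers.get(topic).delete(handler);
    },
    // Deliver a payload to every current subscriber of the topic.
    emit(topic, payload) {
      (handlers.get(topic) || []).forEach((handler) => handler(payload));
    },
  };
}

const bus = createBus();
const received = [];
const off = bus.on('user:login', (msg) => received.push(msg)); // "application B"
bus.emit('user:login', { name: 'alice' }); // "application A" publishes
off(); // B unsubscribes, e.g. when it is unmounted
bus.emit('user:login', { name: 'bob' }); // no longer delivered
```

Returning an unsubscribe function from `on` makes it easy for a child app to clean up its listeners when it unmounts.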

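Techniques available in the frontend include:

1. Communicating through window. Pay attention to the domain setting; this approach is fairly complex, costly to maintain, and hard to control.

2. Sockets. The main app and the child apps each connect to a socket and communicate through the server. Hardly anyone does this because it is complex and expensive.

3. Simple interaction through the URL. Most applications communicate through route parameters, which is easy to implement and gives a good experience.

4. Storage media such as localStorage.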

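Authorization

How do you unify the permission system across micro-frontend modules? This is far from easy to solve on the frontend, and getting it wrong is painful.
Usually the backend identifies the user by cookie, and a backend API returns the corresponding permission data, which the frontend then checks a second time.

Pollution

1. Global environment pollution
2. Event pollution
3. Style pollution
4. Timer pollution
5. localStorage pollution

Global-environment and style pollution are usually handled with snapshots or proxy-based interception; with newer browser APIs, Shadow DOM is also an option.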

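Sandbox

Another core module is the sandbox. Because several child apps are shown again and again in the same container, each child app leaves side effects on the current environment: global styles, global variables, event listeners, timers, and so on. The sandbox provides an isolated environment for the running program so that applications do not affect each other. It isolates resources in the application's runtime environment and hooks into the application lifecycle to clean up and reload them.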
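The snapshot approach can be sketched like this. It is a simplified illustration, not the sandbox of any specific framework: `SnapshotSandbox` is a hypothetical class, and a plain object stands in for `window` so the example runs outside a browser.

```javascript
// Hypothetical snapshot sandbox: records the global state on activate and
// restores it on deactivate, remembering what the child app changed.
class SnapshotSandbox {
  constructor(globalObject) {
    this.global = globalObject; // in a browser this would be window
    this.snapshot = {};         // state of the global before the child ran
    this.modified = {};         // changes made by the child, replayed later
  }

  activate() {
    this.snapshot = { ...this.global };
    Object.assign(this.global, this.modified); // replay the child's earlier changes
  }

  deactivate() {
    this.modified = {};
    for (const key of Object.keys(this.global)) {
      if (this.global[key] !== this.snapshot[key]) {
        this.modified[key] = this.global[key]; // remember the child's change
      }
    }
    for (const key of Object.keys(this.global)) {
      if (!(key in this.snapshot)) delete this.global[key]; // drop keys the child added
    }
    Object.assign(this.global, this.snapshot); // restore the original values
  }
}

const fakeWindow = { title: 'main' }; // stand-in for window
const sandbox = new SnapshotSandbox(fakeWindow);
sandbox.activate();
fakeWindow.title = 'child';   // side effects made while the child app is mounted
fakeWindow.childOnly = 1;
sandbox.deactivate();         // main app sees its original globals again
```

Copying every global on each switch is simple but slow on a real `window`, which is one reason proxy-based interception is the usual alternative where `Proxy` is available.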

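Summary: what is a micro-frontend

A micro-frontend (Micro-Frontends) is an architecture similar to microservices. It applies the microservice idea to the browser, that is:

A web application changes from one monolith into an application that aggregates several small frontend applications. Each frontend application can still be run, developed, and deployed independently. A micro-frontend is not simply a frontend framework or tool; it is an architectural system.

The concept was first proposed at the end of 2016. Search Google for Micro-Frontends and see the top-ranked article at https://micro-frontends.org, which presents an early micro-frontend model.

What can micro-frontends do?

1. Split and refine
2. Integrate legacy systems
3. Build and release independently
4. Governance, circuit breaking, degradation
...

Related resources

Frontend microservice solutions 2 - Single-SPA: https://www.jianshu.com/p/c0f4b837dbea
Micro-frontends, a must-read for frontend developers: https://zhuanlan.zhihu.com/p/162726399
Micro-frontends, the easiest introduction: https://zhuanlan.zhihu.com/p/141530392

In the next part we will get hands-on and build a micro-frontend framework based on single-spa. Stay tuned.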

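A Tour of Micro-Frontends, Part 2: single-spa in practice

single-spa

single-spa is a JavaScript library that lets many small pages, small components, and frontend components built on different architectures coexist in one page application.

Here is a demo: (https://single-spa.surge.sh)

The library lets your application use several technology stacks (Vue, React, Angular, and so on), so teams can develop in parallel and then use one shared router to switch between routes seamlessly. You can also use a single stack split across teams; in the end you only need this library to integrate the pieces and give them different route names.

Advantages:

  • Agility

    Independent development and faster deployment cycles: each team can choose its own technology and update its stack in good time. A piece can be deployed as soon as it is finished, without waiting for everything else.

  • Lower risk

    The risk of bugs and regressions drops, and mutual dependencies fall sharply.

  • Smaller units

    Testing is simpler and faster, and a small change no longer touches the whole application.

  • Continuous delivery

    Customer value is delivered faster, which helps continuous integration, continuous deployment, and continuous delivery.

Disadvantages:

  • Complex configuration

    single-spa is relatively complex to configure. There is the simpler qiankun, and you can also wrap single-spa in a framework that suits you better.

  • Some wasted resources

    The core logic is still to request a manifest, fetch the JS files, and then render. This inevitably introduces some redundancy. For consumer-facing (C-end) applications this can be a serious problem; for business-facing (B-end) applications it is acceptable and stays within a controllable range.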

The core logic of single-spa

A few diagrams are enough to explain the core logic of single-spa.

avatar

The first diagram shows the first step: the webpack application generates a manifest.json file whose content looks roughly like this:

{
"files": {
"static/js/0.chunk.js": "/static/js/0.chunk.js",
"static/js/0.chunk.js.map": "/static/js/0.chunk.js.map",
"static/js/1.chunk.js": "/static/js/1.chunk.js",
"static/js/1.chunk.js.map": "/static/js/1.chunk.js.map",
"main.js": "/static/js/main.chunk.js",
"main.js.map": "/static/js/main.chunk.js.map",
"runtime-main.js": "/static/js/bundle.js",
"runtime-main.js.map": "/static/js/bundle.js.map",
"index.html": "/index.html",
"static/media/logo.svg": "/static/media/logo.103b5fa1.svg"
},
"entrypoints": [
"static/js/bundle.js",
"static/js/0.chunk.js",
"static/js/main.chunk.js"
]
}

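The key is the entrypoints property. From the manifest we obtain the project's list of entry files and load them dynamically with script tags, which is how different micro-frontend applications can be loaded on demand.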
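As a small sketch of that step, the helper below turns the entrypoints array of a manifest into absolute script URLs. `entrypointUrls` is a hypothetical name and the base URL is an assumed dev-server address; injecting the script tags is left out so the example runs outside a browser.

```javascript
// Hypothetical helper: map manifest entrypoints to absolute script URLs.
function entrypointUrls(manifest, baseUrl) {
  const base = baseUrl.endsWith('/') ? baseUrl : baseUrl + '/';
  return (manifest.entrypoints || [])
    .filter((file) => file.endsWith('.js'))          // only scripts are injected
    .map((file) => base + file.replace(/^\//, '')); // avoid a double slash
}

const sampleManifest = {
  entrypoints: ['static/js/bundle.js', 'static/js/0.chunk.js', 'static/js/main.chunk.js'],
};
const urls = entrypointUrls(sampleManifest, 'http://127.0.0.1:3000');
```

The order of the array is preserved, which matters because the runtime bundle has to load before the chunks that depend on it.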

image-20210225105932045

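The second diagram shows in more detail the core logic of single-spa during rendering:

1. First, there is a main app and child apps. There is only one main app, but there can be several child apps.

2. Second, the main app's index.html usually contains several placeholders, that is, several divs.

For example:

<div id="react-app"></div>
<div id="vue-app"></div>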

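3. Then, in each child, a webpack plugin generates a manifest.json that lists every dependency file that needs to be loaded.

4. The main app loads manifest.json, finds the actual JS files, and adds them to the main app with script tags so that they render.

With this, it should be clear why single-spa seems so magical. Next, let's build a simplified single-spa setup.

Setting up single-spa

vue main

We need to work with the webpack configuration, and the latest vue-cli ships only with Babel by default, so we use the following steps to install a Vue-based main app.

1. Install the packages

npm install @vue/cli @vue/cli-init  -g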

2. Create a project

vue init webpack demo-single

3. Enter the directory

cd demo-single

4. Install the dependencies

npm i single-spa single-spa-vue axios --save

5. In the src directory, create a single-spa configuration file named single-spa-config.js

// single-spa-config.js
import * as singleSpa from 'single-spa'; // import single-spa
import axios from 'axios';

/*
 * runScript:
 * returns a Promise that resolves once a dynamically created
 * script tag for the given URL has loaded.
 */
const runScript = (url) =>
  new Promise((resolve, reject) => {
    const script = document.createElement('script');
    script.src = url;
    script.onload = resolve;
    script.onerror = reject;
    const firstScript = document.getElementsByTagName('script')[0];
    firstScript.parentNode.insertBefore(script, firstScript);
  });

// Fetch the manifest, then load each entrypoint script in order.
const getManifest = async (url) => {
  const { data } = await axios.get(url);
  const { entrypoints } = data;

  for (const entry of entrypoints) {
    await runScript('http://127.0.0.1:3000/' + entry);
  }
};

singleSpa.registerApplication( // register the micro-frontend app
  'singleDemoVue',
  async () => {
    await getManifest('http://127.0.0.1:3000/asset-manifest.json');
    // The child's webpack build exposes its lifecycles as window.singleReact.
    return window.singleReact;
  },
  (location) => location.pathname.startsWith('/react') // route prefix
);

singleSpa.start(); // start

Note: runScript simply creates a script tag, and getManifest is a simple function that fetches the manifest and creates the script tags.

6. Import this file in main.js

import './single-spa-config'

7. Run

npm run dev

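The result is a project like this:

avatar

That completes the entry configuration. It is still very simple; more complex handling belongs in the concrete project.

react child

In the code above, the Vue main app registers a child application and requests its manifest file. Now we need to create that React child app, again in a few steps, using create-react-app to scaffold it quickly:

1. Install the package

npm install create-react-app -g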

2. Create the app

npx create-react-app my-app

3. Once it is created, we need to modify the webpack configuration. create-react-app initializes a local git repository by default, so first commit the current state to it:

git status
git add .
git commit -m ttt

4. Expose the webpack configuration files, which create-react-app hides by default

yarn eject   # or: npm run eject

5. Modify the webpack configuration
In /config/webpack.config.js, add the following to output:

output: {
// ...existing options omitted
library: 'singleReact',
libraryTarget: 'window'
}

avatar

In /scripts/start.js, at the line const devServer = new ..., add a headers setting:

const devServer = new WebpackDevServer(compiler, {
...serverConfig,
// headers setting added here
headers: {
'Access-Control-Allow-Origin': '*',
}

});

avatar

6. Modify src/index.js so that the root is rendered dynamically, and register the lifecycles

import React from 'react';
import ReactDOM from 'react-dom';
import './index.css';
import App from './App';
import reportWebVitals from './reportWebVitals';
import singleSpaReact from 'single-spa-react';

const rootComponent = () => {
ReactDOM.render(
<React.StrictMode>
<App />
</React.StrictMode>
,
document.getElementById('react-root')
);
}

// ReactDOM.render(
// ,
// document.getElementById('root')
// );


const reactLifecycles = singleSpaReact({
React,
ReactDOM,
rootComponent,
errorBoundary(err, info, props) {
// https://reactjs.org/docs/error-boundaries.html
console.error(err)
return (
<div>This renders when a catastrophic error occurs</div>
);
},
});
export const bootstrap = reactLifecycles.bootstrap;
export const mount = reactLifecycles.mount;
export const unmount = reactLifecycles.unmount;


// If you want to start measuring performance in your app, pass a function
// to log results (for example: reportWebVitals(console.log))
// or send to an analytics endpoint. Learn more: https://bit.ly/CRA-vitals
reportWebVitals();

7. Run

npm run start

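8. In the Vue main app, visit /react. You will see the React app rendered below, alongside the Vue app. Done.

Lifecycles

There are four lifecycle functions: bootstrap, mount, unmount, and update. Each lifecycle can be given either as a function that returns a Promise or as an array of such functions.
For a complete and very detailed explanation, see: https://github.com/YataoZhang/my-single-spa/issues/4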
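One way to picture the "function or array of functions" rule is a helper that normalizes either form into a single function that runs the steps in order. This is a sketch of the idea only, not single-spa's source code; `flattenLifecycle` and the `calls` log are hypothetical names.

```javascript
// Hypothetical sketch: accept one Promise-returning function or an array of
// them, and return a single function that runs them one after another.
function flattenLifecycle(fns) {
  const list = Array.isArray(fns) ? fns : [fns];
  return (props) =>
    list.reduce((chain, fn) => chain.then(() => fn(props)), Promise.resolve());
}

const calls = []; // records the order in which the steps ran
const mount = flattenLifecycle([
  async () => { calls.push('prepare'); },
  async () => { calls.push('render'); },
]);
const mounted = mount({ name: 'demo' }); // resolves after both steps finish
```

Chaining with `reduce` keeps the steps strictly sequential, so a later step can rely on the work done by an earlier one.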

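Conclusion

single-spa gives us a complete approach for building a micro-frontend integration framework, but it is not a ready-to-use package, and there are plenty of pitfalls waiting along the way.
In most cases we choose qiankun, which is better packaged and has a friendlier API. Once you have gathered enough experience, you can consider building your own micro-frontend framework to improve overall frontend productivity. The next part covers how qiankun wraps single-spa and how it is used in a real application. After the framework parts are finished, we can dig deeper into how single-spa is implemented and into its concepts.

References

single-spa documentation: https://single-spa.js.org/docs/getting-started-overview/

Micro-frontends with single-spa: https://juejin.cn/post/6844903896884707342

Possibly the most complete micro-frontend solution you have seen: https://www.infoq.cn/article/o6GxRD9iHQOplKICiDDU

single-spa micro-frontends: http://www.soulapp.tech/2019/09/25/single-spa微前端/

Single-Spa + Vue Cli micro-frontend implementation guide (project isolation, remote loading, automatic import): https://juejin.cn/post/6844904025565954055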

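Micro-Frontends, Final Part: a qiankun guide and an overall look at micro-frontends

qiankun: principles and API

qiankun is a higher-level framework built on single-spa. It provides a complete lifecycle and a set of hooks, loads registered micro apps dynamically through route matching, and offers a series of APIs for managing and preloading micro apps. Compared with single-spa it is a considerable step forward.

So qiankun is essentially a wrapper around single-spa. As we saw in the previous part, single-spa implements a micro-frontend application by emitting a manifest.json, using the entry information in it to build script tags dynamically, and rendering, roughly as in the diagram below:

image-20210225112744682

A recap of single-spa's core logic during rendering
1. First, there is a main app and child apps. There is only one main app, but there can be several child apps.
2. Second, the main app's index.html usually contains several placeholders, that is, several divs.

For example:

<div id="react-app"></div>
<div id="vue-app"></div>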

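3. Then, in each child, a webpack plugin generates a manifest.json that lists every dependency file that needs to be loaded.

4. The main app loads this manifest.json, finds the actual JS files, and adds them to the main app with script tags so that they render.

qiankun wraps this logic so that a few simple API calls are enough to control the fairly complex configuration and concepts of single-spa.

Registration

import { registerMicroApps, start } from 'qiankun';
registerMicroApps([
{
name: 'react app', // app name
entry: '//localhost:7100', // app entry; the app must enable CORS
container: '#yourContainer', // the div reserved for this app
activeRule: '/yourActiveRule', // route to match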
},
{
name: 'vue app',
entry: { scripts: ['//localhost:7100/main.js'] },
container: '#yourContainer2',
activeRule: '/yourActiveRule2',
},
]);
start();

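main

main is the host part of a qiankun setup. It is not tied to any framework either: you can use React, Vue, or Angular, and you only need to register the child apps in entry.js.

main usually holds the shared code, for example:
1. Message triggers
2. Shared routes
3. Permission triggers
4. Shared pages such as global management, themes, and user management

You can also put the site's home page here, which speeds up loading of the main shell.

Lifecycles

bootstrap

bootstrap is the equivalent of init. It is called the first time a child app is loaded and is typically used for project initialization.

mount

This is called every time the child app is loaded, much like componentDidMount. Initialization calls such as ReactDOM.render usually go here, so that render runs on every mount.

unmount

This is the opposite of mount. It is called every time the child app is unmounted or switched away from. We usually call ReactDOM.unmountComponentAtNode here to tear the app down and release the project's container.

update

This is an optional lifecycle, called when the child app changes.

Route matching

There are two ways to match routes. Besides the activeRule given at registration, you can render the corresponding child app manually: the loadMicroApp method mounts a child app as a component, so its view can be configured in main just like the view of an ordinary application.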

import { loadMicroApp } from 'qiankun';
import React from 'react';
class App extends React.Component {
containerRef = React.createRef();
microApp = null;
componentDidMount() {
this.microApp = loadMicroApp(
{ name: 'app1', entry: '//localhost:1234', container: this.containerRef.current, props: { name: 'qiankun' } },
);
}
componentWillUnmount() {
this.microApp.unmount();
}
componentDidUpdate() {
this.microApp.update({ name: 'kuitos' });
}
render() {
return <div ref={this.containerRef}></div>;
}
}

Handling styles

Sandbox

qiankun's sandbox mode is enabled through the options of the start API.

The sandbox option is optional:

start({
sandbox: true // true | false | { strictStyleIsolation?: boolean, experimentalStyleIsolation?: boolean }
})

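By default the sandbox guarantees style isolation between child apps in single-instance scenarios, but it cannot guarantee isolation between the main app and a child app, or between child apps in multi-instance scenarios. Setting { strictStyleIsolation: true } turns on strict style isolation. In this mode qiankun wraps each micro app's container in a shadow DOM node, which keeps the micro app's styles from affecting the global scope.
For background on shadow DOM, see this article by coco: https://www.cnblogs.com/coco1s/p/5711795.html

Resolving style conflicts

qiankun automatically isolates styles between micro apps (when the sandbox is on). Isolation between the main app and the micro apps can be ensured manually, for example by prefixing all of the main app's styles. If you use a component library such as ant-design, you can follow the configuration described in its documentation to add a prefix to the main app's styles automatically.

Taking antd as an example:

Configure webpack to modify the less variables

{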
loader: 'less-loader',
+ options: {
+ modifyVars: {
+ '@ant-prefix': 'yourPrefix',
+ },
+ javascriptEnabled: true,
+ },
}

Configure the antd ConfigProvider

import { ConfigProvider } from 'antd';
export const MyApp = () => (
<ConfigProvider prefixCls="yourPrefix">
<App />
</ConfigProvider>
);

webpack configuration

The micro app's bundler also needs the following configuration:

const packageName = require('./package.json').name;
module.exports = {
output: {
library: `${packageName}-[name]`,
libraryTarget: 'umd',
jsonpFunction: `webpackJsonp_${packageName}`,
},
};

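qiankun in practice: a React micro-frontend application

To start, prepare two React applications. Create both directly with create-react-app, which gives you one folder containing two projects.

npx create-react-app main-app
npx create-react-app micro-app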

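We use main as the main app and micro as the child app. According to the API, the main app only needs one register call to bring in the child app.
The child app needs its webpack configuration exposed. create-react-app does not allow manual configuration by default, but one command takes care of that.
Enter the micro-app folder and run the following (create-react-app configuration can also be overridden without ejecting; ejecting is used here for convenience):

npm run eject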

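That completes the project preparation.

Configuring the child app

Configuring the child app takes two steps. The first is the lifecycle configuration. We write the lifecycle functions and put them in main.js:

image-20210225113127374

Then we move ReactDOM.render into the mount lifecycle, so that the app is initialized only when qiankun is ready to mount it:

image-20210225113208656

Do not forget the teardown in unmount:

image-20210225113250407

We change the id of the child app's root node so that the parent app can reference it (remember to change it in the HTML as well):

image-20210225113335581

Finally, adjust the webpack configuration:
1. Make the dev server support CORS and change the port
headers: { 'Access-Control-Allow-Origin': '*', }

image-20210225113406029

image-20210225113433309
2. Add the bundle export settings in webpack.config.js:

image-20210225113504237

Configuring the parent app

Now we can register the child in the main app. First install qiankun:

npm install qiankun --save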

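Then register the child app in the main app's index.js:

image-20210225113612030

Remember that public/index.html also needs a container div whose id matches the child app's id; this is where the child app renders:

image-20210225113650792

Now we can run it and take a look:

image-20210225113731843

It runs. Change a style in micro to see the effect:

image-20210225113803662

Next we need to deal with route navigation.

Routing in practice

As mentioned earlier, qiankun's loadMicroApp API can be used in React, and we use it here to handle route navigation.
Most of the work happens in main-app:
First, create a view file for micro-app (one new view per child app):

image-20210225113839377

Then configure it directly with react-router.
create-react-app does not ship with react-router, so we install it manually:

npm install react-router react-router-dom --save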

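After the change, index.js looks like this:

image-20210225113942064

Try again:

image-20210225114020401

Done!

Conclusion and source code

Compared with the single-spa setup we saw last time, the configuration is much simpler and more direct, and adding a child app is more seamless.
If you would like the demo source code, send me a private message.

Use cases and pitfalls: fixing static resources

If the font files and images referenced in a micro app's CSS use relative paths, they will return 404 after the micro app is bundled.

And once a CSS file has been bundled, the paths of its font files and background images can no longer be corrected by changing publicPath dynamically.

There are three main solutions:

  • Upload all images and other static resources to a CDN and reference the CDN addresses directly in CSS (recommended)
  • Use webpack's url-loader to inline font files and images as base64 (suitable for projects with small fonts and images) (recommended)
  • Use absolute addresses and configure a static directory in nginx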
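For the url-loader option, a rule along the following lines is a reasonable starting point. It is a sketch under assumptions: url-loader and file-loader are assumed to be installed, and the extension list and the 8192-byte `limit` are illustrative values rather than recommendations from the original article.

```javascript
// Hypothetical webpack rule: inline small fonts and images as base64 data URLs.
const assetRule = {
  test: /\.(png|jpe?g|gif|svg|woff2?|eot|ttf)$/i, // illustrative extension list
  use: [
    {
      loader: 'url-loader',
      options: {
        // Files up to this size (in bytes) are inlined as base64;
        // larger files fall back to file-loader by default.
        limit: 8192,
      },
    },
  ],
};

// Fragment of a webpack config that uses the rule.
const webpackConfig = {
  module: { rules: [assetRule] },
};
```

Inlined assets no longer depend on the micro app's public path, at the cost of a larger CSS or JS bundle, which is why this suits projects with small fonts and images.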

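Closing remarks

qiankun's overall approach is sound. It greatly simplifies the use of single-spa and lowers the barrier to micro-frontends. It still has shortcomings: some APIs run into puzzling problems, and the API documentation is not especially intuitive. These are areas for improvement. As for micro-frontends in general, being technology-stack agnostic, allowing old projects to be upgraded gradually, and separating different lines of business is already where they deliver the most value.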

youtube-dl: downloading YouTube and Bilibili videos

youtube-dl is a command-line download tool written in Python that supports downloading videos from YouTube, Bilibili, and other sites.

Installation

UNIX (Linux, macOS, etc.)
Using wget:

sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl

Using curl:

sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl

Windows users can install it by downloading youtube-dlc.exe (do not put it in C:\Windows\System32!).

Usage

Usage is very simple. In a terminal or command prompt, run:

youtube-dl [OPTIONS] URL [URL...]
# Example (using a proxy)
youtube-dl https://www.youtube.com/watch?v=5giYv5n616E --proxy 127.0.0.1:19180

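OPTIONS

youtube-dlc is powerful and offers a large number of settings, which we control through options.

  • Download several URLs

    To download several videos, separate the video URLs with spaces: $ youtube-dl <url1> <url2>

    Alternatively, put all the links in a text file and pass it to youtube-dl as an argument: $ youtube-dl -a url.txt. This command downloads every video listed in url.txt.

  • Set a proxy

    The --proxy URL option supports HTTP and SOCKS proxies:

    youtube-dl --proxy socks5://127.0.0.1:1080 url

    To use the proxy for all further invocations, create a configuration file:

    Linux / macOS: ~/.config/youtube-dl/config

    Windows: %APPDATA%\youtube-dl\config.txt

    Example file content:

    --proxy socks5://127.0.0.1:1080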
  • Download a specific format

    Use -F to list all available formats. To download video and audio of a chosen quality and merge them automatically: youtube-dl -f [format code] [url]

    youtube-dl -f 247+251 https://www.youtube.com/watch?v=UyJ8Qbh_LH0
    # or
    youtube-dl -f bestvideo+bestaudio https://www.youtube.com/watch?v=UyJ8Qbh_LH0

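    On YouTube, resolutions of 1080p and above deliver video and audio as separate streams, so both must be downloaded. Use a combination such as 247+251 (or bestvideo+bestaudio). If FFmpeg is installed on the system, youtube-dl automatically merges the downloaded video and audio and then deletes the separate files. For batch and shell scripts that download high-definition YouTube videos without typing the options each time, see "Using youtube-dl to download YouTube 1080P+ videos": https://chengww.com/archives/youtube_download.html#What-is-it

    PS: FFmpeg is also a very powerful command-line tool that can record, convert, and stream audio and video. Video tutorial for installing FFmpeg on a Mac: https://youtu.be/8nbuqYw2OCw

    FFmpeg usage: ffmpeg [options] [[infile options] -i infile]… {[outfile options] outfile}…

    brew install ffmpeg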
    clear
    ffmpeg -h
    ffmpeg -i input.avi output.mp4
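  • Download only the audio of a video

    youtube-dl can download just the audio of a YouTube video. By default it downloads the audio in Ogg (Opus) format. To get another format such as mp3, the following command downloads the audio of the given video, converts it to MP3, and saves it in the current directory:

    youtube-dl -x --audio-format mp3 https://www.youtube.com/watch?v=7E-cwdnsiow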

    PS: converting the audio format requires ffmpeg or avconv.

  • Download subtitles

    Download subtitles, choosing among ass/srt/best in that order, and convert them to srt:

    youtube-dl --write-sub --sub-format "ass/srt/best" --convert-subs "srt" "video_url"

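    --write-sub: write the subtitle file, that is, download the subtitles.
    --sub-format: the subtitle format; formats are tried in order, and the next one is used if a format is not available.
    --convert-subs: convert the subtitles. The supported formats are limited and srt is the common choice. Without conversion some subtitles may come as .vtt. If ass subtitles are available for download, this option is not needed.

    youtube-dl --proxy 127.0.0.1:19180 --write-sub --sub-format "ass/srt/best" --convert-subs "srt" "https://www.youtube.com/watch?v=Q63qjIXMqwU&t=2s"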
  • Customize the file name

    # Custom video name
    youtube-dl -o 'some name' https://www.youtube.com/watch?v=7E-cwdnsiow
    # Download a playlist and rename the files
    youtube-dl -o "%(playlist_index)s.%(title)s-%(id)s.%(ext)s" PLAYLIST_URL
  • Download with a logged-in account

    youtube-dl -u USERNAME -p PASSWORD UDEMY-Course-URL

Tutorial: Learn how Gatsby works

The goal of this tutorial is to guide you through setting up and deploying your first Gatsby site. Along the way, you’ll learn some general web development topics as well as the fundamentals of building a Gatsby site.

Note: This tutorial is intended to be as accessible as possible to people without much web development experience. If you prefer to jump straight to code, feel free to skip to the quick start.

Set Up Your Development Environment

Before you start building your first Gatsby site, you’ll need to familiarize yourself with some core web technologies and make sure that you have installed all required software tools.

Familiarize yourself with the command line

The command line is a text-based interface used to run commands on your computer. You’ll also often see it referred to as the terminal. In this tutorial, we’ll use both interchangeably. It’s a lot like using the Finder on a Mac or Explorer on Windows. Finder and Explorer are examples of graphical user interfaces (GUI). The command line is a powerful, text-based way to interact with your computer.

Take a moment to locate and open up the command line interface (CLI) for your computer. Depending on which operating system you are using, see instructions for Mac, instructions for Windows or instructions for Linux.

Note: If you’re new to the command line, “running” a command, means “writing a given set of instructions in your command prompt, and hitting the Enter key”. Commands will be shown in a highlighted box, something like node --version, but not every highlighted box is a command! If something is a command it will be mentioned as something you have to run/execute.

Install Node.js for your appropriate operating system

Node.js is an environment that can run JavaScript code outside of a web browser. Gatsby is built with Node.js. To get up and running with Gatsby, you’ll need to have a recent version installed on your computer. npm comes bundled with Node.js so if you don’t have npm, chances are that you don’t have Node.js either.

Mac instructions

To install Gatsby and Node.js on a Mac, it is recommended to use Homebrew. A little set-up in the beginning can save you from some headaches later on!

How to install or verify Homebrew on your computer:

  1. Open your Terminal.
  2. See if Homebrew is installed. You should see “Homebrew” and a version number.

brew -v
  3. If not, download and install Homebrew with the instructions.
  4. Once you’ve installed Homebrew, repeat step 2 to verify.

Install Xcode Command Line Tools:

  1. Open your Terminal.
  2. Install Xcode Command line tools by running:

xcode-select --install

💡 If that fails, download it directly from Apple’s site, after signing-in with an Apple developer account.

  3. After being prompted to start the installation, you’ll be prompted again to accept a software license for the tools to download.

Install Node

  1. Open your Terminal.
  2. Install node with Homebrew:

brew install node

💡 If you don’t want to install it through Homebrew, download the latest Node.js version from the official Node.js website, double click on the downloaded file and go through the installation process.

Windows Instructions

Linux Instructions

Install nvm (Node Version Manager) and needed dependencies. nvm is used to manage Node.js and all its associated versions.

💡 When installing a package, if it asks for confirmation, type y and press enter.

Select your distro:

💡 If the Linux distribution you are using is not listed here, please find instructions on the web.

Ubuntu, Debian, and other apt based distros:

  1. Make sure your Linux distribution is ready to go by running an update and an upgrade:

sudo apt update
sudo apt -y upgrade
  2. Install curl, which allows you to transfer data and download additional dependencies:

sudo apt-get install curl
  3. After it finishes installing, download the latest nvm version:

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.35.1/install.sh | bash
  4. Confirm this has worked. The output should be a version number.

nvm --version
  5. Continue with the section: Set default Node.js version

Arch, Manjaro and other pacman based distros:

  1. Make sure your distribution is ready to go:

sudo pacman -Syu
  2. These distros come installed with curl, so you can use that to download nvm:

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.35.1/install.sh | bash
  3. Before using nvm, you need to install additional dependencies by running:

sudo pacman -S grep awk tar
  4. Confirm this has worked. The output should be a version number.

nvm --version
  5. Continue with the section: Set default Node.js version

Fedora, RedHat, and other dnf based distros:

  1. These distros come installed with curl, so you can use that to download nvm:

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.35.1/install.sh | bash
  2. Confirm this has worked. The output should be a version number.

nvm --version
  3. Continue with the section: Set default Node.js version

Set default Node.js version

When nvm is installed, it does not default to a particular node version. You’ll need to install the version you want and give nvm instructions to use it. This example uses the version 10 release, but more recent version numbers can be used instead.

nvm install 10
nvm use 10

Confirm that this worked:

npm --version
node --version

The output should look similar to the screenshot below, showing version numbers in response to the commands.

img

Once you have followed the installation steps and you have checked everything is installed properly, you can continue to the next step.

Install Git

Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency. When you install a Gatsby “starter” site, Gatsby uses Git behind the scenes to download and install the required files for your starter. You will need to have Git installed to set up your first Gatsby site.

The steps to download and install Git depend on your operating system. Follow the guide for your system:

Using the Gatsby CLI

The Gatsby CLI tool lets you quickly create new Gatsby-powered sites and run commands for developing Gatsby sites. It is a published npm package.

The Gatsby CLI is available via npm and should be installed globally by running:

npm install -g gatsby-cli

Note: when you install Gatsby and run it for the first time, you’ll see a short message notifying you about anonymous usage data that is being collected for Gatsby commands, you can read more about how that data is pulled out and used in the telemetry doc.

See the available commands:

gatsby --help

Check gatsby commands in terminal

💡 If you are unable to successfully run the Gatsby CLI due to a permissions issue, you may want to check out the npm docs on fixing permissions, or this guide.

Create a Gatsby site

Now you are ready to use the Gatsby CLI tool to create your first Gatsby site. Using the tool, you can download “starters” (partially built sites with some default configuration) to help you get moving faster on creating a certain type of site. The “Hello World” starter you’ll be using here is a starter with the bare essentials needed for a Gatsby site.

  1. Open up your terminal.
  2. Create a new site from a starter:

gatsby new hello-world https://github.com/gatsbyjs/gatsby-starter-hello-world

💡 What happened?

  • new is a gatsby command to create a new Gatsby project.
  • Here, hello-world is an arbitrary title — you could pick anything. The CLI tool will place the code for your new site in a new folder called “hello-world”.
  • Lastly, the GitHub URL specified points to a code repository that holds the starter code you want to use.

💡 Depending on your download speed, the amount of time this takes will vary. For brevity’s sake, the gif below was paused during part of the install

  3. Change into the working directory:

cd hello-world

💡 This says ‘I want to change directories (cd) to the “hello-world” subfolder’. Whenever you want to run any commands for your site, you need to be in the context for that site (aka, your terminal needs to be pointed at the directory where your site code lives).

  4. Start the development mode:

gatsby develop

💡 This command starts a development server. You will be able to see and interact with your new site in a development environment — local (on your computer, not published to the internet).

View your site locally

Open up a new tab in your browser and navigate to http://localhost:8000/

Check homepage

Congrats! This is the beginning of your very first Gatsby site! 🎉

You’ll be able to visit the site locally at http://localhost:8000/ for as long as your development server is running. That’s the process you started by running the gatsby develop command. To stop running that process (or to “stop running the development server”), go back to your terminal window, hold down the “control” key, and then hit “c” (ctrl-c). To start it again, run gatsby develop again!

Note: If you are using VM setup like vagrant and/or would like to listen on your local IP address, run gatsby develop --host=0.0.0.0. Now, the development server listens on both http://localhost and your local IP.

Set up a code editor

A code editor is a program designed specifically for editing computer code. There are many great ones out there.

Download VS Code

Gatsby documentation sometimes includes screenshots that were taken in VS Code, so if you don’t have a preferred code editor yet, using VS Code will make sure that your screen looks like the screenshots in the tutorial and docs. If you choose to use VS Code, visit the VS Code site and download the version appropriate for your platform.

Install the Prettier plugin

We also recommend using Prettier, a tool that helps format your code to avoid errors.

You can use Prettier directly in your editor using the Prettier VS Code plugin:

  1. Open the extensions view on VS Code (View => Extensions).
  2. Search for “Prettier - Code formatter”.
  3. Click “Install”. (After installation, you’ll be prompted to restart VS Code to enable the extension. Newer versions of VS Code will automatically enable the extension after download.)

💡 If you’re not using VS Code, check out the Prettier docs for install instructions or other editor integrations.

➡️ What’s Next?

To summarize, in this section you:

  • Learned about the command line and how to use it
  • Installed and learned about Node.js and the npm CLI tool, the version control system Git, and the Gatsby CLI tool
  • Generated a new Gatsby site using the Gatsby CLI tool
  • Ran the Gatsby development server and visited your site locally
  • Downloaded a code editor
  • Installed a code formatter called Prettier

Now, move on to getting to know Gatsby building blocks.

References

Overview of core technologies

It’s not necessary to be an expert with these already — if you’re not, don’t worry! You’ll pick up a lot through the course of this tutorial series. These are some of the main web technologies you’ll use when building a Gatsby site:

  • HTML: A markup language that every web browser is able to understand. It stands for HyperText Markup Language. HTML gives your web content a universal informational structure, defining things like headings, paragraphs, and more.
  • CSS: A presentational language used to style the appearance of your web content (fonts, colors, layout, etc). It stands for Cascading Style Sheets.
  • JavaScript: A programming language that helps us make the web dynamic and interactive.
  • React: A code library (built with JavaScript) for building user interfaces. It’s the framework that Gatsby uses to build pages and structure content.
  • GraphQL: A query language that allows you to pull data into your website. It’s the interface that Gatsby uses for managing site data.

What is a website?

For a comprehensive introduction to what a website is — including an intro to HTML and CSS — check out “Building your first web page”. It’s a great place to start learning about the web. For a more hands-on introduction to HTML, CSS, and JavaScript, check out the tutorials from Codecademy. React and GraphQL also have their own introductory tutorials.

Learn more about the command line

For a great introduction to using the command line, check out Codecademy’s Command Line tutorial for Mac and Linux users, and this tutorial for Windows users. Even if you are a Windows user, the first page of the Codecademy tutorial is a valuable read. It explains what the command line is, not how to interface with it.

Learn more about npm

npm is a JavaScript package manager. A package is a module of code that you can choose to include in your projects. If you downloaded and installed Node.js, npm was installed with it!

npm has three distinct components: the npm website, the npm registry, and the npm command line interface (CLI).

  • On the npm website, you can browse what JavaScript packages are available in the npm registry.
  • The npm registry is a large database of information about JavaScript packages available on npm.
  • Once you’ve identified a package you want, you can use the npm CLI to install it in your project or globally (like other CLI tools). The npm CLI is what talks to the registry — you generally only interact with the npm website or the npm CLI.

💡 Check out npm’s introduction, “What is npm?”.

Learn more about Git

You will not need to know Git to complete this tutorial, but it is a very useful tool. If you are interested in learning more about version control, Git, and GitHub, check out GitHub’s Git Handbook.

Get to Know Gatsby Building Blocks

In the previous section, you prepared your local development environment by installing the necessary software and creating your first Gatsby site using the “hello world” starter. Now, take a deeper dive into the code generated by that starter.

Using Gatsby starters

In tutorial part zero, you created a new site based on the “hello world” starter using the following command:

gatsby new hello-world https://github.com/gatsbyjs/gatsby-starter-hello-world

When creating a new Gatsby site, you can use the following command structure to create a new site based on any existing Gatsby starter:

gatsby new [SITE_DIRECTORY_NAME] [URL_OF_STARTER_GITHUB_REPO]

If you omit a URL from the end, Gatsby will automatically generate a site for you based on the default starter. For this section of the tutorial, stick with the “Hello World” site you already created in tutorial part zero. You can learn more about modifying starters in the docs.

✋ Open up the code

In your code editor, open up the code generated for your “Hello World” site and take a look at the different directories and files contained in the ‘hello-world’ directory. It should look something like this:

Hello World project in VS Code

Note: Again, the editor shown here is Visual Studio Code. If you’re using a different editor, it will look a little different.

Let’s take a look at the code that powers the homepage.

💡 If you stopped your development server after running gatsby develop in the previous section, start it up again now — time to make some changes to the hello-world site!

Familiarizing with Gatsby pages

Open up the /src directory in your code editor. Inside is a single directory: /pages.

Open the file at src/pages/index.js. The code in this file creates a component that contains a single div and some text — appropriately, “Hello world!”

✋ Make changes to the “Hello World” homepage

  1. Change the “Hello World!” text to “Hello Gatsby!” and save the file. If your windows are side-by-side, you can see that your code and content changes are reflected almost instantly in the browser after you save the file.

💡 Gatsby uses hot reloading to speed up your development process. Essentially, when you’re running a Gatsby development server, the Gatsby site files are being “watched” in the background — any time you save a file, your changes will be immediately reflected in the browser. You don’t need to hard refresh the page or restart the development server — your changes just appear.

  2. Now you can make your changes a little more visible. Try replacing the code in src/pages/index.js with the code below and save again. You’ll see changes to the text: the text color will be purple and the font size will be larger.

src/pages/index.js

import React from "react"

export default function Home() {
  return <div style={{ color: `purple`, fontSize: `72px` }}>Hello Gatsby!</div>
}

💡 We’ll be covering more about styling in Gatsby in part two of the tutorial.

  3. Remove the font size styling, change the “Hello Gatsby!” text to a level-one header, and add a paragraph beneath the header.

src/pages/index.js

import React from "react"

export default function Home() {
  return (
    <div style={{ color: `purple` }}>
      <h1>Hello Gatsby!</h1>
      <p>What a world.</p>
    </div>
  )
}

More changes with hot reloading

  4. Add an image (in this case, a random image from Unsplash).

src/pages/index.js

import React from "react"

export default function Home() {
  return (
    <div style={{ color: `purple` }}>
      <h1>Hello Gatsby!</h1>
      <p>What a world.</p>
      <img src="https://source.unsplash.com/random/400x200" alt="" />
    </div>
  )
}

Add image

Wait… HTML in our JavaScript?

If you’re familiar with React and JSX, feel free to skip this section. If you haven’t worked with the React framework before, you may be wondering what HTML is doing in a JavaScript function. Or why we’re importing react on the first line but seemingly not using it anywhere. This hybrid “HTML-in-JS” is actually a syntax extension of JavaScript, for React, called JSX. You can follow along with this tutorial without prior experience with React, but if you’re curious, here’s a brief primer…

Consider the original contents of the src/pages/index.js file:

src/pages/index.js

import React from "react"

export default function Home() {
  return <div>Hello world!</div>
}

In pure JavaScript, it looks more like this:

src/pages/index.js

import React from "react"

export default function Home() {
  return React.createElement("div", null, "Hello world!")
}

Now you can spot the use of the 'react' import! But wait. You’re writing JSX, not pure HTML and JavaScript. How does the browser read that? The short answer: It doesn’t. Gatsby sites come with tooling already set up to convert your source code into something that browsers can interpret.
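
To make the JSX-to-JavaScript step more concrete, here is a minimal sketch of what a compiled component boils down to. The createElement function below is a simplified, hypothetical stand-in written for this example, not React’s actual implementation; it only returns a plain object describing the element.

```javascript
// Simplified stand-in for React.createElement (an assumption for
// illustration only; React's real elements carry more fields).
function createElement(type, props, ...children) {
  return { type, props: { ...props, children } };
}

// Roughly what <div style={{ color: `purple` }}>Hello Gatsby!</div> compiles to.
function Home() {
  return createElement("div", { style: { color: "purple" } }, "Hello Gatsby!");
}

const element = Home();
console.log(element.type); // prints: div
console.log(element.props.style.color); // prints: purple
```

The point of the sketch is that a component is an ordinary function, and its return value is plain data that tooling can later turn into HTML.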

Building with components

The homepage you were just making edits to was created by defining a page component. What exactly is a “component”?

Broadly defined, a component is a building block for your site. It is a self-contained piece of code that describes a section of UI (user interface).

Gatsby is built on React. When we talk about using and defining components, we are really talking about React components — self-contained pieces of code (usually written with JSX) that can accept input and return React elements describing a section of UI.

One of the big mental shifts you make when starting to build with components (if you are already a developer) is that your CSS, HTML, and JavaScript are now tightly coupled, often living within the same file.

While a seemingly simple change, this has profound implications for how you think about building websites.

Take the example of creating a custom button. In the past, you would create a CSS class (perhaps .primary-button) with your custom styles and then use it whenever you want to apply those styles. For example:

<button class="primary-button">Click me</button>

In the world of components, you instead create a PrimaryButton component with your button styles and use it throughout your site like:

<PrimaryButton>Click me</PrimaryButton>

Components become the base building blocks of your site. Instead of being limited to the building blocks the browser provides, e.g. <button />, you can easily create new building blocks that elegantly meet the needs of your projects.

✋ Using page components

Any React component defined in src/pages/*.js will automatically become a page. Let’s see this in action.

You already have a src/pages/index.js file that came with the “Hello World” starter. Let’s create an about page.

  1. Create a new file at src/pages/about.js, copy the following code into the new file, and save.

src/pages/about.js

import React from "react"

export default function About() {
  return (
    <div style={{ color: `teal` }}>
      <h1>About Gatsby</h1>
      <p>Such wow. Very React.</p>
    </div>
  )
}

  2. Navigate to http://localhost:8000/about/

New about page

Just by putting a React component in the src/pages/about.js file, you now have a page accessible at /about.
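
The file-to-URL convention can be sketched in a few lines. The routeForPageFile helper below is hypothetical and written only for this example; it is not Gatsby’s internal code and it ignores cases beyond plain .js files under src/pages.

```javascript
// Hypothetical helper: derive a URL path from a page component's file path.
// Illustrates the convention only; Gatsby's real logic handles more cases.
function routeForPageFile(filePath) {
  const relative = filePath
    .replace(/^src\/pages\//, "") // drop the pages directory prefix
    .replace(/\.js$/, ""); // drop the file extension
  // An index file maps to its containing directory.
  const withoutIndex = relative.replace(/(^|\/)index$/, "");
  return withoutIndex === "" ? "/" : `/${withoutIndex}/`;
}

console.log(routeForPageFile("src/pages/index.js")); // prints: /
console.log(routeForPageFile("src/pages/about.js")); // prints: /about/
```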

✋ Using sub-components

Let’s say the homepage and the about page both got quite large and you were rewriting a lot of things. You can use sub-components to break the UI into reusable pieces. Both of your pages have <h1> headers — create a component that will describe a Header.

  1. Create a new directory at src/components and a file within that directory called header.js.
  2. Add the following code to the new src/components/header.js file.

src/components/header.js

import React from "react"

export default function Header() {
  return <h1>This is a header.</h1>
}

  3. Modify the about.js file to import the Header component. Replace the h1 markup with <Header />:

src/pages/about.js

import React from "react"
import Header from "../components/header"

export default function About() {
  return (
    <div style={{ color: `teal` }}>
      <Header />
      <p>Such wow. Very React.</p>
    </div>
  )
}

Adding Header component

In the browser, the “About Gatsby” header text should now be replaced with “This is a header.” But you don’t want the “About” page to say “This is a header.” You want it to say, “About Gatsby”.

  4. Head back to src/components/header.js and make the following change:

src/components/header.js

import React from "react"

export default function Header(props) {
  return <h1>{props.headerText}</h1>
}

  5. Head back to src/pages/about.js and make the following change:

src/pages/about.js

import React from "react"
import Header from "../components/header"

export default function About() {
  return (
    <div style={{ color: `teal` }}>
      <Header headerText="About Gatsby" />
      <p>Such wow. Very React.</p>
    </div>
  )
}

Passing data to header

You should now see your “About Gatsby” header text again!

What are “props”?

Earlier, you defined React components as reusable pieces of code describing a UI. To make these reusable pieces dynamic you need to be able to supply them with different data. You do that with input called “props”. Props are (appropriately enough) properties supplied to React components.

In about.js you passed a headerText prop with the value of "About Gatsby" to the imported Header sub-component:

src/pages/about.js

<Header headerText="About Gatsby" />

Over in header.js, the header component expects to receive the headerText prop (because you’ve written it to expect that). So you can access it like so:

src/components/header.js

<h1>{props.headerText}</h1>

💡 In JSX, you can embed any JavaScript expression by wrapping it with {}. This is how you can access the headerText property (or “prop!”) from the “props” object.

If you had passed another prop to your <Header /> component, like so…

src/pages/about.js

<Header headerText="About Gatsby" arbitraryPhrase="is arbitrary" />

…you would have been able to also access the arbitraryPhrase prop: {props.arbitraryPhrase}.
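
Because props arrive as a plain JavaScript object, the same idea can be tried outside of React. The sketch below uses a hypothetical renderHeader function that returns an HTML string rather than a React element, purely to show how one function produces different output from different props.

```javascript
// Stand-in for a component: takes a props object, returns markup as a string.
// (A real React component would return JSX instead of a string.)
function renderHeader(props) {
  return `<h1>${props.headerText}</h1>`;
}

// The same function, reused with different data.
const headers = [
  { headerText: "About Gatsby" },
  { headerText: "It's pretty cool" },
].map(props => renderHeader(props));

console.log(headers.join("\n"));
```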

  6. To emphasize how this makes your components reusable, add an extra <Header /> component to the about page: update src/pages/about.js to match the following code and save.

src/pages/about.js

import React from "react"
import Header from "../components/header"

export default function About() {
  return (
    <div style={{ color: `teal` }}>
      <Header headerText="About Gatsby" />
      <Header headerText="It's pretty cool" />
      <p>Such wow. Very React.</p>
    </div>
  )
}

Duplicate header to show reusability

And there you have it: a second header, without rewriting any code, by passing different data using props.

Using layout components

Layout components are for sections of a site that you want to share across multiple pages. For example, Gatsby sites will commonly have a layout component with a shared header and footer. Other common things to add to layouts include a sidebar and/or a navigation menu.

You’ll explore layout components in part three.

Linking between pages

You’ll often want to link between pages. Let’s look at routing in a Gatsby site.

✋ Using the <Link /> component

  1. Open the index page component (src/pages/index.js), import the <Link /> component from Gatsby, add a <Link /> component above the header, and give it a to property with the value of "/contact/" for the pathname:

src/pages/index.js

import React from "react"
import { Link } from "gatsby"
import Header from "../components/header"

export default function Home() {
  return (
    <div style={{ color: `purple` }}>
      <Link to="/contact/">Contact</Link>
      <Header headerText="Hello Gatsby!" />
      <p>What a world.</p>
      <img src="https://source.unsplash.com/random/400x200" alt="" />
    </div>
  )
}

When you click the new “Contact” link on the homepage, you should see…

Gatsby dev 404 page

…the Gatsby development 404 page. Why? Because you’re attempting to link to a page that doesn’t exist yet.

  2. Now you’ll have to create a page component for your new “Contact” page at src/pages/contact.js and have it link back to the homepage:

src/pages/contact.js

import React from "react"
import { Link } from "gatsby"
import Header from "../components/header"

export default function Contact() {
  return (
    <div style={{ color: `teal` }}>
      <Link to="/">Home</Link>
      <Header headerText="Contact" />
      <p>Send us a message!</p>
    </div>
  )
}

After you save the file, you should see the contact page and be able to follow the link to the homepage.

The Gatsby <Link /> component is for linking between pages within your site. For external links to pages not handled by your Gatsby site, use the regular HTML <a> tag.
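
One way to apply this rule consistently is to decide, per link, whether the target is internal. The sketch below uses a hypothetical isInternalLink helper; it treats any path that starts with exactly one slash as internal, which is an assumption that fits simple sites but may not cover every case.

```javascript
// Hypothetical helper: true for site-relative paths such as "/contact/",
// false for absolute URLs and protocol-relative URLs such as "//example.com".
function isInternalLink(to) {
  return /^\/(?!\/)/.test(to);
}

console.log(isInternalLink("/contact/")); // prints: true
console.log(isInternalLink("https://www.gatsbyjs.com/")); // prints: false
```

A wrapper component could use a check like this to render <Link to={to}> for internal paths and a regular <a href={to}> for everything else.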

Deploying a Gatsby site

Gatsby is a modern site generator, which means there are no servers to set up or complicated databases to deploy. Instead, the Gatsby build command produces a directory of static HTML and JavaScript files which you can deploy to a static site hosting service.

Try using Surge for deploying your first Gatsby website. Surge is one of many “static site hosts” which makes it possible to deploy Gatsby sites.

Gatsby Cloud is another deployment option, built by the team behind Gatsby. In the next section, you’ll find instructions for deploying to Gatsby Cloud.

If you haven’t previously installed & set up Surge, open a new terminal window and install their command-line tool:

npm install --global surge

# Then create a (free) account with them
surge login

Next, build your site by running the following command in the terminal at the root of your site, in this case the hello-world folder (tip: you can do this by opening a new tab in the same window you used to run gatsby develop):

gatsby build

The build should take 15-30 seconds. Once the build is finished, it’s interesting to take a look at the files that the gatsby build command just prepared to deploy.

To list the generated files, run the following command from the root of your site. It shows the contents of the public directory:

ls public

Finally, deploy your site by publishing the generated files to surge.sh. If you just created your Surge account, you need to verify your email with Surge before publishing your site (check your inbox for the verification email).

surge public/

Note that you will have to press the enter key after you see the domain: some-name.surge.sh information on your command-line interface.

Once this finishes running, you should see in your terminal something like:

Screenshot of publishing Gatsby site with Surge

Open the web address listed on the bottom line (lowly-pain.surge.sh in this case) and you’ll see your newly published site! Great work!

Alternative: Deploying to Gatsby Cloud

Gatsby Cloud is a platform built specifically for Gatsby sites, with features like real-time previews, fast builds, and integrations with dozens of other tools. It’s the best place to build and deploy sites built with Gatsby, and you can use Gatsby Cloud free for personal projects.

To deploy your site to Gatsby Cloud, create an account on GitHub if you don’t have one. GitHub allows you to host and collaborate on code projects using Git for version control.

Create a new repository on GitHub. Since you’re importing your existing project, you’ll want a completely empty one, so don’t initialize it with README or .gitignore files.

You can tell Git where the remote (i.e. not on your computer) repository is like this:

git remote add origin [GITHUB_REPOSITORY_URL]

When you created a new Gatsby project with a starter, it automatically made an initial git commit, or a set of changes. Now, you can push your changes to the new remote location:

git push -u origin master

Now you’re ready to link this GitHub repository right to Gatsby Cloud! Check out the reference guide on Deploying to Gatsby Cloud.

➡️ What’s Next?

In this section you:

  • Learned about Gatsby starters, and how to use them to create new projects
  • Learned about JSX syntax
  • Learned about components
  • Learned about Gatsby page components and sub-components
  • Learned about React “props” and reusing React components

Now, move on to adding styles to your site!

Introduction to Styling in Gatsby

Welcome to part two of the Gatsby tutorial!

What’s in this tutorial?

In this part, you’re going to explore options for styling Gatsby websites and dive deeper into using React components for building sites.

Using global styles

Every site has some sort of global style. This includes things like the site’s typography and background colors. These styles set the overall feel of the site — much like the color and texture of a wall sets the overall feel of a room.

Creating global styles with standard CSS files

One of the most straightforward ways to add global styles to a site is using a global .css stylesheet.

✋ Create a new Gatsby site

Start by creating a new Gatsby site. It may be best (especially if you’re new to the command line) to close the terminal windows you used for part one and start a new terminal session for part two.

Open a new terminal window, create a new “hello world” Gatsby site in a directory called tutorial-part-two, and then move to this new directory:

gatsby new tutorial-part-two https://github.com/gatsbyjs/gatsby-starter-hello-world
cd tutorial-part-two

You now have a new Gatsby site (based on the Gatsby “hello world” starter) with the following structure:

├── package.json
├── src
│   └── pages
│       └── index.js

✋ Add styles to a CSS file

  1. Create a .css file in your new project:

cd src
mkdir styles
cd styles
touch global.css

Note: Feel free to create these directories and files using your code editor, if you’d prefer.

You should now have a structure like this:

├── package.json
├── src
│   └── pages
│       └── index.js
│   └── styles
│       └── global.css

  2. Define some styles in the global.css file:

src/styles/global.css

html {
  background-color: lavenderblush;
}

Note: The placement of the example CSS file in a /src/styles/ folder is arbitrary.

✋ Include the stylesheet in gatsby-browser.js

  1. Create the gatsby-browser.js file at the root of your project:

cd ../..
touch gatsby-browser.js

Your project’s file structure should now look like this:

├── package.json
├── src
│   └── pages
│       └── index.js
│   └── styles
│       └── global.css
├── gatsby-browser.js

💡 What is gatsby-browser.js? Don’t worry about this too much and for now, just know that gatsby-browser.js is one of a handful of special files that Gatsby looks for and uses (if they exist). Here, the naming of the file is important. If you do want to explore more now, check out the docs.

  2. Import your recently created stylesheet in the gatsby-browser.js file:

gatsby-browser.js

import "./src/styles/global.css"

// or:
// require('./src/styles/global.css')

Note: Both CommonJS (require) and ES Module (import) syntax work here. If you’re not sure which to choose, import is usually a good default. When working with files that are only run in a Node.js environment however (like gatsby-node.js), require will need to be used.

  3. Start the development server:

gatsby develop

If you take a look at your project in the browser, you should see a lavender background applied to the “hello world” starter:

Lavender Hello World!

Tip: This part of the tutorial has focused on the quickest and most straightforward way to get started styling a Gatsby site — importing standard CSS files directly, using gatsby-browser.js. In most cases, the best way to add global styles is with a shared layout component. Check out the docs for more on that approach.

Using component-scoped CSS

So far, we’ve talked about the more traditional approach of using standard CSS stylesheets. Now, we’ll talk about various methods of modularizing CSS to tackle styling in a component-oriented way.

CSS Modules

Let’s explore CSS Modules. Quoting from the CSS Module homepage:

A CSS Module is a CSS file in which all class names and animation names are scoped locally by default.

CSS Modules are very popular because they let you write CSS normally but with a lot more safety. The tool automatically generates unique class and animation names, so you don’t have to worry about selector name collisions.

Gatsby works out of the box with CSS Modules. This approach is highly recommended for those new to building with Gatsby (and React in general).

✋ Build a new page using CSS Modules

In this section, you’ll create a new page component and style that page component using a CSS Module.

First, create a new Container component.

  1. Create a new directory at src/components and then, in this new directory, create a file named container.js and paste the following:

src/components/container.js

import React from "react"
import containerStyles from "./container.module.css"

export default function Container({ children }) {
  return <div className={containerStyles.container}>{children}</div>
}

You’ll notice you imported a CSS module file named container.module.css. Let’s create that file now.

  2. In the same directory (src/components), create a container.module.css file and copy/paste the following:

src/components/container.module.css

.container {
  margin: 3rem auto;
  max-width: 600px;
}

You’ll notice that the file name ends with .module.css instead of the usual .css. This is how you tell Gatsby that this CSS file should be processed as a CSS module rather than plain CSS.

  3. Create a new page component by creating a file at src/pages/about-css-modules.js:

src/pages/about-css-modules.js

import React from "react"

import Container from "../components/container"

export default function About() {
  return (
    <Container>
      <h1>About CSS Modules</h1>
      <p>CSS Modules are cool</p>
    </Container>
  )
}

Now, if you visit http://localhost:8000/about-css-modules/, your page should look something like this:

Page with CSS module styles

✋ Style a component using CSS Modules

In this section, you’ll create a list of people with names, avatars, and short Latin biographies. You’ll create a <User /> component and style that component using a CSS module.

  1. Create the file for the CSS at src/pages/about-css-modules.module.css.
  2. Paste the following into the new file:

src/pages/about-css-modules.module.css

.user {
  display: flex;
  align-items: center;
  margin: 0 auto 12px auto;
}

.user:last-child {
  margin-bottom: 0;
}

.avatar {
  flex: 0 0 96px;
  width: 96px;
  height: 96px;
  margin: 0;
}

.description {
  flex: 1;
  margin-left: 18px;
  padding: 12px;
}

.username {
  margin: 0 0 12px 0;
  padding: 0;
}

.excerpt {
  margin: 0;
}

  3. Import the new src/pages/about-css-modules.module.css file into the about-css-modules.js page you created earlier by editing the first few lines of the file like so:

src/pages/about-css-modules.js

import React from "react"
import styles from "./about-css-modules.module.css"
import Container from "../components/container"

console.log(styles)

The console.log(styles) code will log the resulting import so you can see the result of your processed ./about-css-modules.module.css file. If you open the developer console (using e.g. Firefox or Chrome’s developer tools, often by the F12 key) in your browser, you’ll see:

Import result of CSS module in console

If you compare that to your CSS file, you’ll see that each class is now a key in the imported object pointing to a long string e.g. avatar points to src-pages----about-css-modules-module---avatar---2lRF7. These are the class names CSS Modules generates. They’re guaranteed to be unique across your site. And because you have to import them to use the classes, there’s never any question about where some CSS is being used.
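
The scoping idea can be illustrated with a short sketch. The scopedClassName function and its hash are hypothetical and written for this example; the real generated names come from Gatsby’s build tooling and use a different algorithm, so the output will not match the names in your console.

```javascript
// Tiny non-cryptographic string hash, used only to make names differ per file.
function shortHash(text) {
  let hash = 0;
  for (const char of text) {
    hash = (hash * 31 + char.codePointAt(0)) >>> 0; // keep an unsigned 32-bit value
  }
  return hash.toString(36);
}

// Hypothetical: combine file path, class name, and a hash into one scoped name.
function scopedClassName(filePath, className) {
  const slug = filePath.replace(/[^a-zA-Z0-9]+/g, "-");
  return `${slug}---${className}---${shortHash(filePath + className)}`;
}

// Emulates the object you get from importing a CSS Module.
const styles = Object.fromEntries(
  ["user", "avatar"].map(name => [
    name,
    scopedClassName("src/pages/about-css-modules.module.css", name),
  ])
);

console.log(styles.avatar);
```

Because the file path is part of the generated name, two CSS Modules can both define an .avatar class without colliding.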

  4. Create a new <User /> component inline in the about-css-modules.js page component. Modify about-css-modules.js so it looks like the following:

src/pages/about-css-modules.js

import React from "react"
import styles from "./about-css-modules.module.css"
import Container from "../components/container"

console.log(styles)

const User = props => (
  <div className={styles.user}>
    <img src={props.avatar} className={styles.avatar} alt="" />
    <div className={styles.description}>
      <h2 className={styles.username}>{props.username}</h2>
      <p className={styles.excerpt}>{props.excerpt}</p>
    </div>
  </div>
)

export default function About() {
  return (
    <Container>
      <h1>About CSS Modules</h1>
      <p>CSS Modules are cool</p>
      <User
        username="Jane Doe"
        avatar="https://s3.amazonaws.com/uifaces/faces/twitter/adellecharles/128.jpg"
        excerpt="I'm Jane Doe. Lorem ipsum dolor sit amet, consectetur adipisicing elit."
      />
      <User
        username="Bob Smith"
        avatar="https://s3.amazonaws.com/uifaces/faces/twitter/vladarbatov/128.jpg"
        excerpt="I'm Bob Smith, a vertically aligned type of guy. Lorem ipsum dolor sit amet, consectetur adipisicing elit."
      />
    </Container>
  )
}

Tip: Generally, if you use a component in multiple places on a site, it should be in its own module file in the components directory. But, if it’s used only in one file, create it inline.

The finished page should now look like:

User list page with CSS modules

CSS-in-JS

CSS-in-JS is a component-oriented styling approach. Most generally, it is a pattern where CSS is composed inline using JavaScript.

Using CSS-in-JS with Gatsby

There are many different CSS-in-JS libraries and many of them have Gatsby plugins already. We won’t cover an example of CSS-in-JS in this initial tutorial, but we encourage you to explore what the ecosystem has to offer. There are mini-tutorials for two libraries, in particular, Emotion and Styled Components.

Suggested reading on CSS-in-JS

If you’re interested in further reading, check out Christopher “vjeux” Chedeau’s 2014 presentation that sparked this movement as well as Mark Dalgleish’s more recent post “A Unified Styling Language”.

Other CSS options

Gatsby supports almost every possible styling option (if there isn’t a plugin yet for your favorite CSS option, please contribute one!)

and more!

What’s coming next?

Now continue on to part three of the tutorial, where you’ll learn about Gatsby plugins and layout components.

Creating Nested Layout Components

Welcome to part three!

What’s in this tutorial?

In this part, you’ll learn about Gatsby plugins and creating “layout” components.

Gatsby plugins are JavaScript packages that help add functionality to a Gatsby site. Gatsby is designed to be extensible, which means plugins are able to extend and modify just about everything Gatsby does.

Layout components are for sections of your site that you want to share across multiple pages. For example, sites will commonly have a layout component with a shared header and footer. Other common things to add to layouts are a sidebar and/or navigation menu. On this page for example, the header at the top is part of gatsbyjs.com’s layout component.

Let’s dive into part three.

Using plugins

You’re probably familiar with the idea of plugins. Many software systems support adding custom plugins to add new functionality or even modify the core workings of the software. Gatsby plugins work the same way.

Community members (like you!) can contribute plugins (small amounts of JavaScript code) that others can then use when building Gatsby sites.

There are already hundreds of plugins! Explore the Gatsby Plugin Library.

Our goal with plugins is to make them straightforward to install and use. You will likely be using plugins in almost every Gatsby site you build. While working through the rest of the tutorial you’ll have many opportunities to practice installing and using plugins.

For an initial introduction to using plugins, we’ll install and implement the Gatsby plugin for Typography.js.

Typography.js is a JavaScript library which generates global base styles for your site’s typography. The library has a corresponding Gatsby plugin to streamline using it in a Gatsby site.

✋ Create a new Gatsby site

As we mentioned in part two, at this point it’s probably a good idea to close the terminal window(s) and project files from previous parts of the tutorial, to keep things clean on your desktop. Then open a new terminal window and run the following commands to create a new Gatsby site in a directory called tutorial-part-three and then move to this new directory:

gatsby new tutorial-part-three https://github.com/gatsbyjs/gatsby-starter-hello-world
cd tutorial-part-three

✋ Install and configure gatsby-plugin-typography

There are two main steps to using a plugin: Installing and configuring.

  1. Install the gatsby-plugin-typography npm package.

npm install gatsby-plugin-typography react-typography typography typography-theme-fairy-gates

Note: Typography.js requires a few additional packages, so those are included in the instructions. Additional requirements like this will be listed in the “install” instructions of each plugin.

  2. Edit the file gatsby-config.js at the root of your project to the following:

gatsby-config.js

module.exports = {
  plugins: [
    {
      resolve: `gatsby-plugin-typography`,
      options: {
        pathToConfigModule: `src/utils/typography`,
      },
    },
  ],
}

The gatsby-config.js is another special file that Gatsby will automatically recognize. This is where you add plugins and other site configuration.

Check out the doc on gatsby-config.js to read more, if you wish.
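To make the shape of the plugins array more concrete: each entry is either a plain plugin name or an object with a resolve key and an options key, as in the config above. The sketch below shows that both forms carry the same information. The normalizePlugin helper and the gatsby-plugin-example name are hypothetical and exist only for illustration; this is not how Gatsby processes its config internally.

```javascript
// Hypothetical helper, for illustration only: Gatsby does its own processing
// of gatsby-config.js internally.
function normalizePlugin(entry) {
  if (typeof entry === "string") {
    // Short form: just the plugin name, no options.
    return { resolve: entry, options: {} }
  }
  // Long form: an object with `resolve` and (optionally) `options`.
  return { resolve: entry.resolve, options: entry.options || {} }
}

const config = {
  plugins: [
    `gatsby-plugin-example`, // placeholder name, not a real recommendation
    {
      resolve: `gatsby-plugin-typography`,
      options: { pathToConfigModule: `src/utils/typography` },
    },
  ],
}

const normalized = config.plugins.map(normalizePlugin)
console.log(normalized.map(plugin => plugin.resolve))
```

The short string form is convenient when a plugin needs no options; the object form is needed as soon as you pass any.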

  3. Typography.js needs a configuration file. Create a new directory called utils in the src directory. Then add a new file called typography.js to utils and copy the following into the file:

src/utils/typography.js

import Typography from "typography"
import fairyGateTheme from "typography-theme-fairy-gates"

const typography = new Typography(fairyGateTheme)

export const { scale, rhythm, options } = typography
export default typography

  4. Start the development server.

gatsby develop

Once you load the site, if you inspect the generated HTML using the Chrome developer tools, you’ll see that the typography plugin added a <style> element to the <head> element with its generated CSS:

Developer tool panel showing `typography.js` CSS styles

✋ Make some content and style changes

Copy the following into your src/pages/index.js so you can see the effect of the CSS generated by Typography.js better.

src/pages/index.js

import React from "react"

export default function Home() {
  return (
    <div>
      <h1>Hi! I'm building a fake Gatsby site as part of a tutorial!</h1>
      <p>
        What do I like to do? Lots of course but definitely enjoy building
        websites.
      </p>
    </div>
  )
}

Your site should now look like this:

Screenshot of site with no layout styling

Let’s make a quick improvement. Many sites have a single column of text centered in the middle of the page. To create this, add the following styles to the <div> in src/pages/index.js.

src/pages/index.js

import React from "react"

export default function Home() {
  return (
    <div style={{ margin: `3rem auto`, maxWidth: 600 }}>
      <h1>Hi! I'm building a fake Gatsby site as part of a tutorial!</h1>
      <p>
        What do I like to do? Lots of course but definitely enjoy building
        websites.
      </p>
    </div>
  )
}

Screenshot of a Gatsby page with a centered column of text

Sweet. You’ve installed and configured your very first Gatsby plugin!

Creating layout components

Now let’s move on to learning about layout components. To get ready for this part, add a couple new pages to your project: an about page and a contact page.

src/pages/about.js

import React from "react"

export default function About() {
  return (
    <div>
      <h1>About me</h1>
      <p>
        I’m good enough, I’m smart enough, and gosh darn it, people like me!
      </p>
    </div>
  )
}

src/pages/contact.js

import React from "react"

export default function Contact() {
  return (
    <div>
      <h1>I'd love to talk! Email me at the address below</h1>
      <p>
        <a href="mailto:me@example.com">me@example.com</a>
      </p>
    </div>
  )
}

Let’s see what the new about page looks like:

About page with uncentered text

Hmm. It would be nice if the content of the two new pages were centered like the index page. And it would be nice to have some sort of global navigation so it’s easy for visitors to find and visit each of the sub-pages.

You’ll tackle these changes by creating your first layout component.

✋ Create your first layout component

  1. Create a new directory at src/components.
  2. Create a very basic layout component at src/components/layout.js:

src/components/layout.js

import React from "react"

export default function Layout({ children }) {
  return (
    <div style={{ margin: `3rem auto`, maxWidth: 650, padding: `0 1rem` }}>
      {children}
    </div>
  )
}

  3. Import this new layout component into your src/pages/index.js page component:

src/pages/index.js

import React from "react"
import Layout from "../components/layout"

export default function Home() {
  return (
    <Layout>
      <h1>Hi! I'm building a fake Gatsby site as part of a tutorial!</h1>
      <p>
        What do I like to do? Lots of course but definitely enjoy building
        websites.
      </p>
    </Layout>
  )
}

Screenshot of a Gatsby page with a centered column of text

Sweet, the layout is working! The content of your index page is still centered.

But try navigating to /about/, or /contact/. The content on those pages still won’t be centered.

  4. Import the layout component in about.js and contact.js (as you did for index.js in the previous step).

The content of all three of your pages is centered thanks to this single shared layout component!

✋ Add a site title

  1. Add the following line to your new layout component:

src/components/layout.js

import React from "react"

export default function Layout({ children }) {
  return (
    <div style={{ margin: `3rem auto`, maxWidth: 650, padding: `0 1rem` }}>
      <h3>MySweetSite</h3>
      {children}
    </div>
  )
}

If you go to any of your three pages, you’ll see the same title added, e.g. the /about/ page:

Formatted page showing site title

  2. Copy the following into your layout component file:

src/components/layout.js

import React from "react"
import { Link } from "gatsby"

const ListLink = props => (
  <li style={{ display: `inline-block`, marginRight: `1rem` }}>
    <Link to={props.to}>{props.children}</Link>
  </li>
)

export default function Layout({ children }) {
  return (
    <div style={{ margin: `3rem auto`, maxWidth: 650, padding: `0 1rem` }}>
      <header style={{ marginBottom: `1.5rem` }}>
        <Link to="/" style={{ textShadow: `none`, backgroundImage: `none` }}>
          <h3 style={{ display: `inline` }}>MySweetSite</h3>
        </Link>
        <ul style={{ listStyle: `none`, float: `right` }}>
          <ListLink to="/">Home</ListLink>
          <ListLink to="/about/">About</ListLink>
          <ListLink to="/contact/">Contact</ListLink>
        </ul>
      </header>
      {children}
    </div>
  )
}

A Gatsby page showing navigation links

And there you have it! A three-page site with basic global navigation.

Challenge: With your new “layout component” powers, try adding headers, footers, global navigation, sidebars, etc. to your Gatsby sites!

What’s coming next?

Continue on to part four of the tutorial where you’ll start learning about Gatsby’s data layer and programmatically creating pages!

Data in Gatsby

Welcome to Part Four of the tutorial! Halfway through! Hope things are starting to feel pretty comfortable 😀

Recap of the first half of the tutorial

So far, you’ve been learning how to use React.js—how powerful it is to be able to create your own components to act as custom building blocks for websites.

You’ve also explored styling components using CSS Modules.

What’s in this tutorial?

In the next four parts of the tutorial (including this one), you’ll be diving into the Gatsby data layer, which is a powerful feature of Gatsby that lets you build sites from Markdown, WordPress, headless CMSs, and other data sources of all flavors.

NOTE: Gatsby’s data layer is powered by GraphQL. For an in-depth tutorial on GraphQL, we recommend How to GraphQL.

Data in Gatsby

A website has four parts: HTML, CSS, JS, and data. The first half of the tutorial focused on the first three. Now let’s learn how to use data in Gatsby sites.

What is data?

A very computer science-y answer would be: data is things like "strings", integers (42), objects ({ pizza: true }), etc.

For the purpose of working in Gatsby, however, a more useful answer is “everything that lives outside a React component”.

So far, you’ve been writing text and adding images directly in components, which is an excellent way to build many websites. Often, though, you want to store data outside components and then bring the data into the component as needed.

If you’re building a site with WordPress (so other contributors have a nice interface for adding & maintaining content) and Gatsby, the data for the site (pages and posts) are in WordPress and you pull that data, as needed, into your components.

Data can also live in file types like Markdown, CSV, etc. as well as databases and APIs of all sorts.

Gatsby’s data layer lets you pull data from these (and any other source) directly into your components — in the shape and form you want.

Using Unstructured Data vs GraphQL

Do I have to use GraphQL and source plugins to pull data into Gatsby sites?

Absolutely not! You can use the node createPages API to pull unstructured data into Gatsby pages directly, rather than through the GraphQL data layer. This is a great choice for small sites, while GraphQL and source plugins can help save time with more complex sites.

See the Using Gatsby without GraphQL guide to learn how to pull data into your Gatsby site using the node createPages API and to see an example site!
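As a minimal sketch of this approach, the function below creates one page per item from a plain array, without any GraphQL query. It relies on the createPage action that Gatsby passes to createPages; the pandas data, the /pandas/ URL prefix, and the template path are assumptions made up for this example. In a real site the function would live in gatsby-node.js and be exported as exports.createPages.

```javascript
// Hypothetical in-memory data standing in for "unstructured" content
// (in practice it might come from a JSON file or an HTTP request).
const pandas = [
  { slug: "bao-bao", name: "Bao Bao" },
  { slug: "mei-xiang", name: "Mei Xiang" },
]

// Assumed location of a template component; Gatsby expects an absolute path.
const pandaTemplate = "/path/to/site/src/templates/panda.js"

// In a real site this would be exported from gatsby-node.js as
// `exports.createPages`; it is a plain function here to keep the sketch
// self-contained.
function createPages({ actions }) {
  const { createPage } = actions
  pandas.forEach(panda => {
    createPage({
      path: `/pandas/${panda.slug}/`,
      component: pandaTemplate,
      // `context` is handed to the template as its `pageContext` prop.
      context: { panda },
    })
  })
}
```

The template component would then read props.pageContext.panda directly instead of running a page query.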

When do I use unstructured data vs GraphQL?

If you’re building a small site, one efficient way to build it is to pull in unstructured data as outlined in this guide, using the createPages API. If the site becomes more complex later on, you move on to building more complex sites, or you’d like to transform your data, follow these steps:

  1. Check out the Plugin Library to see if the source plugins and/or transformer plugins you’d like to use already exist
  2. If they don’t exist, read the Plugin Authoring guide and consider building your own!

How Gatsby’s data layer uses GraphQL to pull data into components

There are many options for loading data into React components. One of the most popular and powerful of these is a technology called GraphQL.

GraphQL was invented at Facebook to help product engineers pull needed data into components.

GraphQL is a query language (the QL part of its name). If you’re familiar with SQL, it works in a very similar way. Using a special syntax, you describe the data you want in your component and then that data is given to you.

Gatsby uses GraphQL to enable components to declare the data they need.
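The idea of "describe the shape you want, get back data in that shape" can be mimicked in a few lines of plain JavaScript. This is only a toy: the select function and the sample data are hypothetical, and real GraphQL has its own query syntax, type system, and resolvers.

```javascript
// Toy illustration only: a "query" here is a plain object whose keys name the
// fields to keep. `true` means "give me this field"; a nested object means
// "descend and select from inside".
function select(source, shape) {
  const result = {}
  for (const key of Object.keys(shape)) {
    const wanted = shape[key]
    result[key] = wanted === true ? source[key] : select(source[key], wanted)
  }
  return result
}

// Hypothetical data available to the site.
const allData = {
  site: {
    siteMetadata: { title: "Example Site", author: "A. Panda" },
    buildTime: "2021-07-20",
  },
}

// "Give me only site.siteMetadata.title."
const selected = select(allData, { site: { siteMetadata: { title: true } } })
console.log(JSON.stringify(selected))
// prints {"site":{"siteMetadata":{"title":"Example Site"}}}
```

The result contains only the requested fields, nested exactly like the request.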

Create a new example site

Create another new site for this part of the tutorial. You’re going to build a Markdown blog called “Pandas Eating Lots”. It’s dedicated to showing off the best pictures and videos of pandas eating lots of food. Along the way, you’ll be dipping your toes into GraphQL and Gatsby’s Markdown support.

Open a new terminal window and run the following commands to create a new Gatsby site in a directory called tutorial-part-four. Then navigate to the new directory:

gatsby new tutorial-part-four https://github.com/gatsbyjs/gatsby-starter-hello-world
cd tutorial-part-four

Then install some other needed dependencies at the root of the project. You’ll use the Typography theme “Kirkham”, and you’ll try out a CSS-in-JS library, “Emotion”:

npm install gatsby-plugin-typography typography react-typography typography-theme-kirkham gatsby-plugin-emotion @emotion/react

Set up a site similar to what you ended with in Part Three. This site will have a layout component and two page components:

src/components/layout.js

import React from "react"
import { css } from "@emotion/react"
import { Link } from "gatsby"

import { rhythm } from "../utils/typography"

export default function Layout({ children }) {
  return (
    <div
      css={css`
        margin: 0 auto;
        max-width: 700px;
        padding: ${rhythm(2)};
        padding-top: ${rhythm(1.5)};
      `}
    >
      <Link to={`/`}>
        <h3
          css={css`
            margin-bottom: ${rhythm(2)};
            display: inline-block;
            font-style: normal;
          `}
        >
          Pandas Eating Lots
        </h3>
      </Link>
      <Link
        to={`/about/`}
        css={css`
          float: right;
        `}
      >
        About
      </Link>
      {children}
    </div>
  )
}

src/pages/index.js

import React from "react"
import Layout from "../components/layout"

export default function Home() {
  return (
    <Layout>
      <h1>Amazing Pandas Eating Things</h1>
      <div>
        <img
          src="https://2.bp.blogspot.com/-BMP2l6Hwvp4/TiAxeGx4CTI/AAAAAAAAD_M/XlC_mY3SoEw/s1600/panda-group-eating-bamboo.jpg"
          alt="Group of pandas eating bamboo"
        />
      </div>
    </Layout>
  )
}

src/pages/about.js

import React from "react"
import Layout from "../components/layout"

export default function About() {
  return (
    <Layout>
      <h1>About Pandas Eating Lots</h1>
      <p>
        We're the only site running on your computer dedicated to showing the
        best photos and videos of pandas eating lots of food.
      </p>
    </Layout>
  )
}

src/utils/typography.js

import Typography from "typography"
import kirkhamTheme from "typography-theme-kirkham"

const typography = new Typography(kirkhamTheme)

export default typography
export const rhythm = typography.rhythm

gatsby-config.js (must be in the root of your project, not under src)

gatsby-config.js

module.exports = {
  plugins: [
    `gatsby-plugin-emotion`,
    {
      resolve: `gatsby-plugin-typography`,
      options: {
        pathToConfigModule: `src/utils/typography`,
      },
    },
  ],
}

Add the above files and then run gatsby develop, per usual, and you should see the following:

start

You have another small site with a layout and two pages.

Now you can start querying 😋

Your first GraphQL query

When building sites, you’ll probably want to reuse common bits of data — like the site title for example. Look at the /about/ page. You’ll notice that you have the site title (Pandas Eating Lots) in both the layout component (the site header) as well as in the <h1 /> of the about.js page (the page header).

But what if you want to change the site title in the future? You’d have to search for the title across all your components and edit each instance. This is both cumbersome and error-prone, especially for larger, more complex sites. Instead, you can store the title in one location and reference that location from other files; change the title in a single place, and Gatsby will pull your updated title into files that reference it.

The place for these common bits of data is the siteMetadata object in the gatsby-config.js file. Add your site title to the gatsby-config.js file:

gatsby-config.js

module.exports = {
  siteMetadata: {
    title: `Title from siteMetadata`,
  },
  plugins: [
    `gatsby-plugin-emotion`,
    {
      resolve: `gatsby-plugin-typography`,
      options: {
        pathToConfigModule: `src/utils/typography`,
      },
    },
  ],
}

Restart the development server.

Use a page query

Now the site title is available to be queried. Add it to the about.js file using a page query:

src/pages/about.js

import React from "react"
import { graphql } from "gatsby"
import Layout from "../components/layout"

export default function About({ data }) {
  return (
    <Layout>
      <h1>About {data.site.siteMetadata.title}</h1>
      <p>
        We're the only site running on your computer dedicated to showing the
        best photos and videos of pandas eating lots of food.
      </p>
    </Layout>
  )
}

export const query = graphql`
  query {
    site {
      siteMetadata {
        title
      }
    }
  }
`

It worked! 🎉

Page title pulling from siteMetadata

The basic GraphQL query that retrieves the title in your about.js changes above is:

src/pages/about.js

{
  site {
    siteMetadata {
      title
    }
  }
}

💡 In part five, you’ll meet a tool that lets us interactively explore the data available through GraphQL, and help formulate queries like the one above.

Page queries live outside of the component definition — by convention at the end of a page component file — and are only available on page components.

Use a StaticQuery

StaticQuery is an API introduced in Gatsby v2 that allows non-page components (like your layout.js component) to retrieve data via GraphQL queries. Let’s use its hook version, useStaticQuery.

Go ahead and make some changes to your src/components/layout.js file to use the useStaticQuery hook and a {data.site.siteMetadata.title} reference that uses this data. When you are done, your file will look like this:

src/components/layout.js

import React from "react"
import { css } from "@emotion/react"
import { useStaticQuery, Link, graphql } from "gatsby"

import { rhythm } from "../utils/typography"

export default function Layout({ children }) {
  const data = useStaticQuery(
    graphql`
      query {
        site {
          siteMetadata {
            title
          }
        }
      }
    `
  )
  return (
    <div
      css={css`
        margin: 0 auto;
        max-width: 700px;
        padding: ${rhythm(2)};
        padding-top: ${rhythm(1.5)};
      `}
    >
      <Link to={`/`}>
        <h3
          css={css`
            margin-bottom: ${rhythm(2)};
            display: inline-block;
            font-style: normal;
          `}
        >
          {data.site.siteMetadata.title}
        </h3>
      </Link>
      <Link
        to={`/about/`}
        css={css`
          float: right;
        `}
      >
        About
      </Link>
      {children}
    </div>
  )
}

Another success! 🎉

Page title and layout title both pulling from siteMetadata

Why use two different queries here? These examples were quick introductions to the query types, how they are formatted, and where they can be used. For now, keep in mind that only pages can make page queries. Non-page components, such as Layout, can use StaticQuery. Part 7 of the tutorial explains these in greater depth.

But let’s restore the real title.

One of the core principles of Gatsby is that creators need an immediate connection to what they’re creating (hat tip to Bret Victor). In other words, when you make any change to code you should immediately see the effect of that change. You manipulate an input of Gatsby and you see the new output showing up on the screen.

So almost everywhere, changes you make will immediately take effect. Edit the gatsby-config.js file again, this time changing the title back to “Pandas Eating Lots”. The change should show up very quickly in your site pages.

Both titles say Pandas Eating Lots

What’s coming next?

Next, you’ll be learning about how to pull data into your Gatsby site using GraphQL with source plugins in part five of the tutorial.

Source Plugins

This tutorial is part of a series about Gatsby’s data layer. Make sure you’ve gone through part 4 before continuing here.

What’s in this tutorial?

In this tutorial, you’ll be learning about how to pull data into your Gatsby site using GraphQL and source plugins. Before you learn about these plugins, however, you’ll want to know how to use something called GraphiQL, a tool that helps you structure your queries correctly.

Introducing GraphiQL

GraphiQL is the GraphQL integrated development environment (IDE). It’s a powerful (and all-around awesome) tool you’ll use often while building Gatsby websites.

You can access it when your site’s development server is running—normally at http://localhost:8000/___graphql.

Poke around the built-in Site “type” and see what fields are available on it — including the siteMetadata object you queried earlier. Try opening GraphiQL and play with your data! Press Ctrl + Space (or use Shift + Space as an alternate keyboard shortcut) to bring up the autocomplete window and Ctrl + Enter to run the GraphQL query. You’ll be using GraphiQL a lot more through the remainder of the tutorial.

Using the GraphiQL Explorer

The GraphiQL Explorer enables you to interactively construct full queries by clicking through available fields and inputs without the repetitive process of typing these queries out by hand.

Source plugins

Data in Gatsby sites can come from anywhere: APIs, databases, CMSs, local files, etc.

Source plugins fetch data from their source. E.g. the filesystem source plugin knows how to fetch data from the file system. The WordPress plugin knows how to fetch data from the WordPress API.

Add gatsby-source-filesystem and explore how it works.

First, install the plugin at the root of the project:

npm install gatsby-source-filesystem

Then add it to your gatsby-config.js:

gatsby-config.js

module.exports = {
  siteMetadata: {
    title: `Pandas Eating Lots`,
  },
  plugins: [
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        name: `src`,
        path: `${__dirname}/src/`,
      },
    },
    `gatsby-plugin-emotion`,
    {
      resolve: `gatsby-plugin-typography`,
      options: {
        pathToConfigModule: `src/utils/typography`,
      },
    },
  ],
}

Save that and restart the gatsby development server. Then open up GraphiQL again.

In the explorer pane, you’ll see allFile and file available as selections:

The GraphiQL IDE showing the new dropdown options provided by the gatsby-source-filesystem plugin

Click the allFile dropdown. Position your cursor after allFile in the query area, and then type Ctrl + Enter. This will pre-fill a query for the id of each file. Press “Play” to run the query:

The GraphiQL IDE showing the results of a filesystem query

In the Explorer pane, the id field has automatically been selected. Make selections for more fields by checking the field’s corresponding checkbox. Press “Play” to run the query again, with the new fields:

The GraphiQL IDE showing the new fields in the Explorer column

Alternatively, you can add fields by using the autocomplete shortcut (Ctrl + Space). This will show queryable fields on the File nodes.

The GraphiQL IDE showing the gatsby-source-filesystem plugin's new autocomplete options

Try adding a number of fields to your query, pressing Ctrl + Enter each time to re-run the query. You’ll see the updated query results:

The GraphiQL IDE showing the results of the query

The result is an array of File “nodes” (node is a fancy name for an object in a “graph”). Each File node object has the fields you queried for.

Build a page with a GraphQL query

Building new pages with Gatsby often starts in GraphiQL. You first sketch out the data query by playing in GraphiQL then copy this to a React page component to start building the UI.

Let’s try this.

Create a new file at src/pages/my-files.js with the allFile GraphQL query you just created:

src/pages/my-files.js

import React from "react"
import { graphql } from "gatsby"
import Layout from "../components/layout"

export default function MyFiles({ data }) {
  console.log(data)
  return (
    <Layout>
      <div>Hello world</div>
    </Layout>
  )
}

export const query = graphql`
  query {
    allFile {
      edges {
        node {
          relativePath
          prettySize
          extension
          birthTime(fromNow: true)
        }
      }
    }
  }
`

Note the console.log(data) line above. When creating a new component, it’s often helpful to log the data you’re getting from the GraphQL query so you can explore it in your browser console while building the UI.

If you visit the new page at /my-files/ and open up your browser console you will see something like:

Browser console showing the structure of the data object

The shape of the data matches the shape of the GraphQL query.
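To illustrate, here is roughly what such a data object could look like for the allFile query, and how its nesting follows the query level by level. The file names, sizes, and times are made-up sample values, not output from a real site.

```javascript
// Hypothetical result for the allFile query above, with made-up values.
const data = {
  allFile: {
    edges: [
      {
        node: {
          relativePath: "pages/index.js",
          prettySize: "1.2 kB",
          extension: "js",
          birthTime: "2 days ago",
        },
      },
      {
        node: {
          relativePath: "components/layout.js",
          prettySize: "900 B",
          extension: "js",
          birthTime: "3 days ago",
        },
      },
    ],
  },
}

// Each level of the query (allFile > edges > node > fields) is one level of
// nesting in the result, so reading it back is a matter of walking the same path.
const rows = data.allFile.edges.map(
  ({ node }) => `${node.relativePath} (${node.prettySize})`
)
console.log(rows.join("\n"))
```

Because the result mirrors the query, you can usually write the component's data access code just by reading the query from top to bottom.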

Add some code to your component to print out the File data.

src/pages/my-files.js

import React from "react"
import { graphql } from "gatsby"
import Layout from "../components/layout"

export default function MyFiles({ data }) {
  console.log(data)
  return (
    <Layout>
      <div>
        <h1>My Site's Files</h1>
        <table>
          <thead>
            <tr>
              <th>relativePath</th>
              <th>prettySize</th>
              <th>extension</th>
              <th>birthTime</th>
            </tr>
          </thead>
          <tbody>
            {data.allFile.edges.map(({ node }, index) => (
              <tr key={index}>
                <td>{node.relativePath}</td>
                <td>{node.prettySize}</td>
                <td>{node.extension}</td>
                <td>{node.birthTime}</td>
              </tr>
            ))}
          </tbody>
        </table>
      </div>
    </Layout>
  )
}

export const query = graphql`
  query {
    allFile {
      edges {
        node {
          relativePath
          prettySize
          extension
          birthTime(fromNow: true)
        }
      }
    }
  }
`

And now visit http://localhost:8000/my-files… 😲

A browser window showing a list of the files in the site

What’s coming next?

Now you’ve learned how source plugins bring data into Gatsby’s data system. In the next tutorial, you’ll learn how transformer plugins transform the raw content brought by source plugins. The combination of source plugins and transformer plugins can handle all data sourcing and data transformation you might need when building a Gatsby site. Learn about transformer plugins in part six of the tutorial.

Transformer plugins

This tutorial is part of a series about Gatsby’s data layer. Make sure you’ve gone through part 4 and part 5 before continuing here.

What’s in this tutorial?

The previous tutorial showed how source plugins bring data into Gatsby’s data system. In this tutorial, you’ll learn how transformer plugins transform the raw content brought by source plugins. The combination of source plugins and transformer plugins can handle all data sourcing and data transformation you might need when building a Gatsby site.

Transformer plugins

Often, the format of the data you get from source plugins isn’t what you want to use to build your website. The filesystem source plugin lets you query data about files but what if you want to query data inside files?

To make this possible, Gatsby supports transformer plugins which take raw content from source plugins and transform it into something more usable.

For example, markdown files. Markdown is nice to write in but when you build a page with it, you need the markdown to be HTML.

Add a markdown file to your site at src/pages/sweet-pandas-eating-sweets.md (This will become your first markdown blog post) and learn how to transform it to HTML using transformer plugins and GraphQL.

src/pages/sweet-pandas-eating-sweets.md

---
title: "Sweet Pandas Eating Sweets"
date: "2017-08-10"
---

Pandas are really sweet.

Here's a video of a panda eating sweets.

<iframe width="560" height="315" src="https://www.youtube.com/embed/4n0xNbfJLR8" frameborder="0" allowfullscreen></iframe>

Once you save the file, look at /my-files/ again—the new markdown file is in the table. This is a very powerful feature of Gatsby. Like the earlier siteMetadata example, source plugins can live-reload data. gatsby-source-filesystem is always scanning for new files to be added and when they are, re-runs your queries.

Add a transformer plugin that can transform markdown files:

npm install gatsby-transformer-remark

Then add it to the gatsby-config.js like normal:

gatsby-config.js

module.exports = {
  siteMetadata: {
    title: `Pandas Eating Lots`,
  },
  plugins: [
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        name: `src`,
        path: `${__dirname}/src/`,
      },
    },
    `gatsby-transformer-remark`,
    `gatsby-plugin-emotion`,
    {
      resolve: `gatsby-plugin-typography`,
      options: {
        pathToConfigModule: `src/utils/typography`,
      },
    },
  ],
}

Restart the development server then refresh (or open again) GraphiQL and look at the autocomplete:

GraphiQL screenshot showing new `gatsby-transformer-remark` autocomplete options

Select allMarkdownRemark and run it as you did for allFile. There you’ll see the markdown file you recently added. Explore the fields that are available on the MarkdownRemark node.

GraphiQL screenshot showing the result of a query

Ok! Hopefully, some basics are starting to fall into place. Source plugins bring data into Gatsby’s data system and transformer plugins transform raw content brought by source plugins. This pattern can handle all data sourcing and data transformation you might need when building a Gatsby site.
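The two-step pattern can be sketched in plain JavaScript. This is a conceptual stand-in, not how gatsby-source-filesystem or gatsby-transformer-remark are implemented: the fileNode object, the transformMarkdown function, and the body field name are all hypothetical, and the parser only understands simple key: "value" frontmatter lines.

```javascript
// Step 1 (stand-in for a source plugin): a "File" node with raw content.
// The content is inlined so the sketch runs without touching the disk.
const fileNode = {
  relativePath: "pages/sweet-pandas-eating-sweets.md",
  extension: "md",
  content: [
    "---",
    'title: "Sweet Pandas Eating Sweets"',
    'date: "2017-08-10"',
    "---",
    "",
    "Pandas are really sweet.",
  ].join("\n"),
}

// Step 2 (stand-in for a transformer plugin): derive a more useful node from
// the raw content by splitting the frontmatter block from the body.
function transformMarkdown(file) {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(file.content)
  if (!match) {
    // No frontmatter block: everything is body text.
    return { frontmatter: {}, body: file.content }
  }
  const frontmatter = {}
  for (const line of match[1].split("\n")) {
    const separator = line.indexOf(":")
    if (separator === -1) continue
    const key = line.slice(0, separator).trim()
    // Strip one pair of surrounding double quotes, if present.
    const value = line.slice(separator + 1).trim().replace(/^"|"$/g, "")
    frontmatter[key] = value
  }
  return { frontmatter, body: match[2].trim() }
}

const markdownNode = transformMarkdown(fileNode)
console.log(markdownNode.frontmatter.title)
```

The point of the split is that queries can then ask for frontmatter.title directly instead of parsing file contents in every component.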

Create a list of your site’s markdown files in src/pages/index.js

Now you’ll have to create a list of your markdown files on the front page. Like many blogs, you want to end up with a list of links on the front page pointing to each blog post. With GraphQL you can query for the current list of markdown blog posts so you won’t need to maintain the list manually.

Like with the src/pages/my-files.js page, replace src/pages/index.js with the following to add a GraphQL query with some initial HTML and styling.

src/pages/index.js

import React from "react"
import { graphql } from "gatsby"
import { css } from "@emotion/react"
import { rhythm } from "../utils/typography"
import Layout from "../components/layout"

export default function Home({ data }) {
  console.log(data)
  return (
    <Layout>
      <div>
        <h1
          css={css`
            display: inline-block;
            border-bottom: 1px solid;
          `}
        >
          Amazing Pandas Eating Things
        </h1>
        <h4>{data.allMarkdownRemark.totalCount} Posts</h4>
        {data.allMarkdownRemark.edges.map(({ node }) => (
          <div key={node.id}>
            <h3
              css={css`
                margin-bottom: ${rhythm(1 / 4)};
              `}
            >
              {node.frontmatter.title}{" "}
              <span
                css={css`
                  color: #bbb;
                `}
              >
                — {node.frontmatter.date}
              </span>
            </h3>
            <p>{node.excerpt}</p>
          </div>
        ))}
      </div>
    </Layout>
  )
}

export const query = graphql`
  query {
    allMarkdownRemark {
      totalCount
      edges {
        node {
          id
          frontmatter {
            title
            date(formatString: "DD MMMM, YYYY")
          }
          excerpt
        }
      }
    }
  }
`

Now the frontpage should look like:

Screenshot of the frontpage

But your one blog post looks a bit lonely. So let’s add another one at src/pages/pandas-and-bananas.md

src/pages/pandas-and-bananas.md

---
title: "Pandas and Bananas"
date: "2017-08-21"
---

Do Pandas eat bananas? Check out this short video that shows that yes! pandas do seem to really enjoy bananas!

<iframe width="560" height="315" src="https://www.youtube.com/embed/4SZl1r2O_bY" frameborder="0" allowfullscreen></iframe>

Frontpage showing two posts

Which looks great! Except… the order of the posts is wrong.

But this is easy to fix. When querying a connection of some type, you can pass a variety of arguments to the GraphQL query. You can sort and filter nodes, set how many nodes to skip, and choose the limit of how many nodes to retrieve. With this powerful set of operators, you can select any data you want—in the format you need.

In your index page’s GraphQL query, change allMarkdownRemark to allMarkdownRemark(sort: { fields: [frontmatter___date], order: DESC }). Note: There are 3 underscores between frontmatter and date. Save this and the sort order should be fixed.

Try opening GraphiQL and playing with different sort options. You can sort the allFile connection along with other connections.
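For intuition, the effect of the sort, skip, and limit arguments can be imitated on a plain array. The queryPosts function below is a hypothetical stand-in that runs in ordinary JavaScript; in Gatsby these operations are handled by the GraphQL layer, not by code you write.

```javascript
// Hypothetical nodes, shaped like the two markdown posts in this tutorial.
const posts = [
  { frontmatter: { title: "Sweet Pandas Eating Sweets", date: "2017-08-10" } },
  { frontmatter: { title: "Pandas and Bananas", date: "2017-08-21" } },
]

// Rough equivalent of sorting by frontmatter date, then applying skip and limit.
function queryPosts(nodes, { order = "ASC", skip = 0, limit = Infinity } = {}) {
  const direction = order === "DESC" ? -1 : 1
  // Copy first so the caller's array is not reordered.
  const sorted = [...nodes].sort(
    // ISO dates (YYYY-MM-DD) compare correctly as plain strings.
    (a, b) => direction * a.frontmatter.date.localeCompare(b.frontmatter.date)
  )
  return sorted.slice(skip, skip + limit)
}

const newestFirst = queryPosts(posts, { order: "DESC" })
console.log(newestFirst.map(post => post.frontmatter.title))

const secondNewest = queryPosts(posts, { order: "DESC", skip: 1, limit: 1 })
console.log(secondNewest.map(post => post.frontmatter.title))
```

With order set to DESC, the post dated 2017-08-21 comes before the one dated 2017-08-10, which is the ordering a blog index usually wants.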

For more documentation on our query operators, explore our GraphQL reference guide.

Challenge

Try creating a new page containing a blog post and see what happens to the list of blog posts on the homepage!

What’s coming next?

This is great! You’ve just created a nice index page where you’re querying your markdown files and producing a list of blog post titles and excerpts. But you don’t want to just see excerpts, you want actual pages for your markdown files.

You could continue to create pages by placing React components in src/pages. However, you’ll next learn how to programmatically create pages from data. Gatsby is not limited to making pages from files like many static site generators. Gatsby lets you use GraphQL to query your data and map the query results to pages—all at build time. This is a really powerful idea. You’ll be exploring its implications and ways to use it in the next tutorial, where you’ll learn how to programmatically create pages from data.

Programmatically create pages from data

This tutorial is part of a series about Gatsby’s data layer. Make sure you’ve gone through part 4, part 5, and part 6 before continuing here.

What’s in this tutorial?

In the previous tutorial, you created a nice index page that queries markdown files and produces a list of blog post titles and excerpts. But you don’t want to just see excerpts, you want actual pages for your markdown files.

You could continue to create pages by placing React components in src/pages. However, you’ll now learn how to programmatically create pages from data. Gatsby is not limited to making pages from files like many static site generators. Gatsby lets you use GraphQL to query your data and map the query results to pages—all at build time. This is a really powerful idea. You’ll be exploring its implications and ways to use it for the remainder of this part of the tutorial.

Let’s get started.

Creating slugs for pages

A ‘slug’ is the unique identifying part of a web address, such as the /docs/tutorial/part-seven part of the page https://www.gatsbyjs.com/docs/tutorial/part-seven/.

It is also referred to as the ‘path’ but this tutorial will use the term ‘slug’ for consistency.

Creating new pages has two steps:

  1. Generate the “path” or “slug” for the page.
  2. Create the page.

Note: Often data sources will directly provide a slug or pathname for content — when working with one of those systems (e.g. a CMS), you don’t need to create the slugs yourself as you do with markdown files.

To create your markdown pages, you’ll learn to use two Gatsby APIs: onCreateNode and createPages. These are two workhorse APIs you’ll see used in many sites and plugins.

We do our best to make Gatsby APIs simple to implement. To implement an API, you export a function with the name of the API from gatsby-node.js.

So, here’s where you’ll do that. In the root of your site, create a file named gatsby-node.js. Then add the following.

gatsby-node.js

exports.onCreateNode = ({ node }) => {
  console.log(`Node created of type "${node.internal.type}"`)
}

This onCreateNode function will be called by Gatsby whenever a new node is created (or updated).
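
If the idea of "exporting a function with the name of the API" feels abstract, the following sketch shows the general pattern of a host calling hooks by name. It is only an illustration: siteApis and runApi are hypothetical names, and Gatsby's real dispatch logic is more involved.

```javascript
// Hypothetical stand-in for the contents of gatsby-node.js: an object of exported API functions.
const siteApis = {
  onCreateNode: ({ node }) => `Node created of type "${node.internal.type}"`,
}

// Hypothetical host: looks up an API by name and calls it only if the site implements it.
function runApi(apis, apiName, args) {
  const implementation = apis[apiName]
  return typeof implementation === "function" ? implementation(args) : undefined
}

const message = runApi(siteApis, "onCreateNode", { node: { internal: { type: "File" } } })
console.log(message)
```

APIs the site does not export are skipped, which is why you only implement the ones you need.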

Stop and restart the development server. As you do, you’ll see quite a few newly created nodes get logged to the terminal console.

In the next section, you will use this API to add slugs for your Markdown pages to MarkdownRemark nodes.

Change your function so it now only logs MarkdownRemark nodes.

gatsby-node.js

exports.onCreateNode = ({ node }) => {
  if (node.internal.type === `MarkdownRemark`) {
    console.log(node.internal.type)
  }
}

You want to use each markdown file name to create the page slug. So pandas-and-bananas.md will become /pandas-and-bananas/. But how do you get the file name from the MarkdownRemark node? To get it, you need to traverse the “node graph” to its parent File node, as File nodes contain data you need about files on disk. To do that, you’ll use the getNode() helper. Add it to onCreateNode’s function parameters, and call it to get the file node:

gatsby-node.js

exports.onCreateNode = ({ node, getNode }) => {
  if (node.internal.type === `MarkdownRemark`) {
    const fileNode = getNode(node.parent)
    console.log(`\n`, fileNode.relativePath)
  }
}

After restarting your development server, you should see the relative paths for your two markdown files print to the terminal screen.

markdown-relative-path

Now you’ll have to create slugs. As the logic for creating slugs from file names can get tricky, the gatsby-source-filesystem plugin ships with a function for creating slugs. Let’s use that.

gatsby-node.js

const { createFilePath } = require(`gatsby-source-filesystem`)

exports.onCreateNode = ({ node, getNode }) => {
  if (node.internal.type === `MarkdownRemark`) {
    console.log(createFilePath({ node, getNode, basePath: `pages` }))
  }
}

The function handles finding the parent File node along with creating the slug. Run the development server again and you should see two slugs logged to the terminal, one for each markdown file.
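
As a rough mental model of those two jobs, here is a simplified sketch that walks from a MarkdownRemark node to its parent File node and turns the relative path into a slug. It is an assumption-laden illustration, not the real createFilePath code: nodesById, the node ids, and sketchSlug are all hypothetical, and the real function handles more cases (index files, nested directories, basePath, and so on).

```javascript
// Hypothetical node store keyed by id, standing in for Gatsby's node graph.
const nodesById = new Map([
  ["file-1", { id: "file-1", internal: { type: "File" }, relativePath: "pandas-and-bananas.md" }],
  ["md-1", { id: "md-1", internal: { type: "MarkdownRemark" }, parent: "file-1" }],
])
const getNode = id => nodesById.get(id)

// Simplified slug builder (an assumption, not createFilePath's real code):
// walk to the parent File node, drop the extension, and wrap the path in slashes.
function sketchSlug({ node, getNode }) {
  const fileNode = getNode(node.parent)
  const withoutExtension = fileNode.relativePath.replace(/\.[^\/.]+$/, "")
  return `/${withoutExtension}/`
}

console.log(sketchSlug({ node: getNode("md-1"), getNode }))
```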

Now you can add your new slugs directly onto the MarkdownRemark nodes. This is powerful, as any data you add to nodes is available to query later with GraphQL. So, it’ll be easy to get the slug when it comes time to create the pages.

To do so, you’ll use a function passed to your API implementation called createNodeField. This function allows you to create additional fields on nodes created by other plugins. Only the original creator of a node can directly modify the node—all other plugins (including your gatsby-node.js) must use this function to create additional fields.

gatsby-node.js

const { createFilePath } = require(`gatsby-source-filesystem`)

exports.onCreateNode = ({ node, getNode, actions }) => {
  const { createNodeField } = actions
  if (node.internal.type === `MarkdownRemark`) {
    const slug = createFilePath({ node, getNode, basePath: `pages` })
    createNodeField({
      node,
      name: `slug`,
      value: slug,
    })
  }
}
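
Conceptually, the extra data ends up under a fields key on the node, which is why the slug is later queried as fields { slug }. The sketch below imitates that effect on a plain object. The sketchCreateNodeField function is a hypothetical stand-in, not Gatsby's implementation, which also tracks which plugin owns each field.

```javascript
// Hypothetical, simplified stand-in for createNodeField: extra data goes under node.fields.
function sketchCreateNodeField({ node, name, value }) {
  node.fields = { ...node.fields, [name]: value }
  return node
}

const markdownNode = { id: "md-1", internal: { type: "MarkdownRemark" } }
sketchCreateNodeField({ node: markdownNode, name: "slug", value: "/pandas-and-bananas/" })
console.log(markdownNode.fields.slug)
```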

Restart the development server and open or refresh GraphiQL. Then run this GraphQL query to see your new slugs.

{
  allMarkdownRemark {
    edges {
      node {
        fields {
          slug
        }
      }
    }
  }
}

Now that the slugs are created, you can create the pages.

Creating pages

In the same gatsby-node.js file, add the following.

gatsby-node.js

const { createFilePath } = require(`gatsby-source-filesystem`)

exports.onCreateNode = ({ node, getNode, actions }) => {
  const { createNodeField } = actions
  if (node.internal.type === `MarkdownRemark`) {
    const slug = createFilePath({ node, getNode, basePath: `pages` })
    createNodeField({
      node,
      name: `slug`,
      value: slug,
    })
  }
}

exports.createPages = async ({ graphql, actions }) => {
  // Note: the graphql function call returns a Promise
  // see: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise for more info
  const result = await graphql(`
    query {
      allMarkdownRemark {
        edges {
          node {
            fields {
              slug
            }
          }
        }
      }
    }
  `)
  console.log(JSON.stringify(result, null, 4))
}

You’ve added an implementation of the createPages API which Gatsby calls so plugins can add pages.

As mentioned in the intro to this part of the tutorial, the steps to programmatically creating pages are:

  1. Query data with GraphQL
  2. Map the query results to pages

The above code is the first step for creating pages from your markdown as you’re using the supplied graphql function to query the markdown slugs you created. Then you’re logging out the result of the query which should look like:

query-markdown-slugs

You need one additional thing beyond a slug to create pages: a page template component. Like everything in Gatsby, programmatic pages are powered by React components. When creating a page, you need to specify which component to use.

Create a directory at src/templates, and then add the following in a file named src/templates/blog-post.js.

src/templates/blog-post.js

import React from "react"
import Layout from "../components/layout"

export default function BlogPost() {
  return (
    <Layout>
      <div>Hello blog post</div>
    </Layout>
  )
}

Then update gatsby-node.js

gatsby-node.js

const path = require(`path`)
const { createFilePath } = require(`gatsby-source-filesystem`)

exports.onCreateNode = ({ node, getNode, actions }) => {
  const { createNodeField } = actions
  if (node.internal.type === `MarkdownRemark`) {
    const slug = createFilePath({ node, getNode, basePath: `pages` })
    createNodeField({
      node,
      name: `slug`,
      value: slug,
    })
  }
}

exports.createPages = async ({ graphql, actions }) => {
  const { createPage } = actions
  const result = await graphql(`
    query {
      allMarkdownRemark {
        edges {
          node {
            fields {
              slug
            }
          }
        }
      }
    }
  `)

  result.data.allMarkdownRemark.edges.forEach(({ node }) => {
    createPage({
      path: node.fields.slug,
      component: path.resolve(`./src/templates/blog-post.js`),
      context: {
        // Data passed to context is available
        // in page queries as GraphQL variables.
        slug: node.fields.slug,
      },
    })
  })
}
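
If you want to convince yourself what the second step (mapping query results to pages) does without running Gatsby, one option is to pull that step into a plain function and feed it a canned result. The sketch below does that. The mapResultToPages name, the canned result, and the recording createPage stub are hypothetical; only the shape of the result and the createPage arguments follow the code above.

```javascript
const path = require("path")

// Step 2 on its own: turn a query result into createPage calls.
function mapResultToPages(result, createPage) {
  result.data.allMarkdownRemark.edges.forEach(({ node }) => {
    createPage({
      path: node.fields.slug,
      component: path.resolve(`./src/templates/blog-post.js`),
      context: { slug: node.fields.slug },
    })
  })
}

// Stubs (hypothetical): a canned query result and a createPage that records its calls.
const cannedResult = {
  data: { allMarkdownRemark: { edges: [{ node: { fields: { slug: "/pandas-and-bananas/" } } }] } },
}
const createdPages = []
mapResultToPages(cannedResult, page => createdPages.push(page))

console.log(createdPages.map(page => page.path))
```

Each edge in the result becomes exactly one createPage call, and the slug is passed twice: once as the page path and once in context, so the template's page query can use it as a variable.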

Restart the development server and your pages will be created! An easy way to find new pages you create while developing is to go to a random path where Gatsby will helpfully show you a list of pages on the site. If you go to http://localhost:8000/sdf, you’ll see the new pages you created.

new-pages

Visit one of them and you see:

hello-world-blog-post

Which is a bit boring and not what you want. Now you can pull in data from your markdown post. Change src/templates/blog-post.js to:

src/templates/blog-post.js

import React from "react"
import { graphql } from "gatsby"
import Layout from "../components/layout"

export default function BlogPost({ data }) {
  const post = data.markdownRemark
  return (
    <Layout>
      <div>
        <h1>{post.frontmatter.title}</h1>
        <div dangerouslySetInnerHTML={{ __html: post.html }} />
      </div>
    </Layout>
  )
}

export const query = graphql`
  query($slug: String!) {
    markdownRemark(fields: { slug: { eq: $slug } }) {
      html
      frontmatter {
        title
      }
    }
  }
`

And…

blog-post

Sweet!

The last step is to link to your new pages from the index page.

Return to src/pages/index.js, query for your markdown slugs, and create links.

src/pages/index.js

import React from "react"
import { css } from "@emotion/react"
import { Link, graphql } from "gatsby"
import { rhythm } from "../utils/typography"
import Layout from "../components/layout"

export default function Home({ data }) {
  return (
    <Layout>
      <div>
        <h1
          css={css`
            display: inline-block;
            border-bottom: 1px solid;
          `}
        >
          Amazing Pandas Eating Things
        </h1>
        <h4>{data.allMarkdownRemark.totalCount} Posts</h4>
        {data.allMarkdownRemark.edges.map(({ node }) => (
          <div key={node.id}>
            <Link
              to={node.fields.slug}
              css={css`
                text-decoration: none;
                color: inherit;
              `}
            >
              <h3
                css={css`
                  margin-bottom: ${rhythm(1 / 4)};
                `}
              >
                {node.frontmatter.title}{" "}
                <span
                  css={css`
                    color: #555;
                  `}
                >
                  — {node.frontmatter.date}
                </span>
              </h3>
              <p>{node.excerpt}</p>
            </Link>
          </div>
        ))}
      </div>
    </Layout>
  )
}

export const query = graphql`
  query {
    allMarkdownRemark(sort: { fields: [frontmatter___date], order: DESC }) {
      totalCount
      edges {
        node {
          id
          frontmatter {
            title
            date(formatString: "DD MMMM, YYYY")
          }
          fields {
            slug
          }
          excerpt
        }
      }
    }
  }
`

And there you go! A working, albeit small, blog!

Challenge

Try playing more with the site. Try adding some more markdown files. Explore querying other data from the MarkdownRemark nodes and adding them to the front page or blog posts pages.

In this part of the tutorial, you’ve learned the foundations of building with Gatsby’s data layer. You’ve learned how to source and transform data using plugins, how to use GraphQL to map data to pages, and then how to build page template components where you query for data for each page.

What’s coming next?

Now that you’ve built a Gatsby site, where do you go next?

Preparing a Site to Go Live

Wow! You’ve come a long way! You’ve learned how to:

  • create new Gatsby sites
  • create pages and components
  • style components
  • add plugins to a site
  • source & transform data
  • use GraphQL to query data for pages
  • programmatically create pages from your data

In this final section, you’re going to walk through some common steps for preparing a site to go live by introducing a powerful site diagnostic tool called Lighthouse. Along the way, we’ll introduce a few more plugins you’ll often want to use in your Gatsby sites.

Audit with Lighthouse

Quoting from the Lighthouse website:

Lighthouse is an open-source, automated tool for improving the quality of web pages. You can run it against any web page, public or requiring authentication. It has audits for performance, accessibility, progressive web apps (PWAs), and more.

Lighthouse is included in Chrome DevTools. Running its audit — and then addressing the errors it finds and implementing the improvements it suggests — is a great way to prepare your site to go live. It helps give you confidence that your site is as fast and accessible as possible.

Try it out!

First, you need to create a production build of your Gatsby site. The Gatsby development server is optimized for making development fast, but the site that it generates, while closely resembling a production version of the site, isn’t as optimized.

✋ Create a production build

  1. Stop the development server (if it’s still running) and run the following command:

gatsby build

💡 As you learned in part 1, this does a production build of your site and outputs the built static files into the public directory.

  2. View the production site locally. Run:

gatsby serve

Once this starts, you can view your site at http://localhost:9000.

Run a Lighthouse audit

Now you’re going to run your first Lighthouse test.

  1. If you haven’t already done so, open the site in Chrome Incognito Mode so no extensions interfere with the test. Then, open up the Chrome DevTools.
  2. Click on the “Audits” tab where you’ll see a screen that looks like:

Lighthouse audit start

  3. Click “Perform an audit…” (all available audit types should be selected by default). Then click “Run audit”. (It’ll then take a minute or so to run the audit.) Once the audit is complete, you should see results that look like this:

Lighthouse audit results

As you can see, Gatsby’s performance is excellent out of the box but you’re missing some things for PWA, Accessibility, Best Practices, and SEO that will improve your scores (and in the process make your site much more friendly to visitors and search engines).

Add a manifest file

Looks like you have a pretty lackluster score in the “Progressive Web App” category. Let’s address that.

But first, what exactly are PWAs?

They are regular websites that take advantage of modern browser functionality to augment the web experience with app-like features and benefits. Check out Google’s overview of what defines a PWA experience.

Inclusion of a web app manifest is one of the three generally accepted baseline requirements for a PWA.

Quoting Google:

The web app manifest is a simple JSON file that tells the browser about your web application and how it should behave when ‘installed’ on the user’s mobile device or desktop.

Gatsby’s manifest plugin configures Gatsby to create a manifest.webmanifest file on every site build.

✋ Using gatsby-plugin-manifest

  1. Install the plugin:

npm install gatsby-plugin-manifest

  2. Add a favicon for your app under src/images/icon.png. For the purposes of this tutorial you can use this example icon, should you not have one available. The icon is necessary to build all images for the manifest. For more information, look at the docs for gatsby-plugin-manifest.
  3. Add the plugin to the plugins array in your gatsby-config.js file.

gatsby-config.js

module.exports = {
  plugins: [
    {
      resolve: `gatsby-plugin-manifest`,
      options: {
        name: `GatsbyJS`,
        short_name: `GatsbyJS`,
        start_url: `/`,
        background_color: `#6b37bf`,
        theme_color: `#6b37bf`,
        // Enables "Add to Homescreen" prompt and disables browser UI (including back button)
        // see https://developers.google.com/web/fundamentals/web-app-manifest/#display
        display: `standalone`,
        icon: `src/images/icon.png`, // This path is relative to the root of the site.
      },
    },
  ],
}

That’s all you need to get started with adding a web manifest to a Gatsby site. The example given reflects a base configuration; check out the plugin reference for more options.
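
To get a feel for what ends up in the manifest, here is a rough sketch that serializes the options above as JSON. It rests on an assumption: icon is treated as a plugin option rather than a manifest member, so it is dropped, and the icon entries the plugin generates are not reproduced. The sketchManifest helper is hypothetical and the real manifest.webmanifest will contain more than this.

```javascript
// Options copied from the gatsby-config.js example above.
const manifestOptions = {
  name: `GatsbyJS`,
  short_name: `GatsbyJS`,
  start_url: `/`,
  background_color: `#6b37bf`,
  theme_color: `#6b37bf`,
  display: `standalone`,
  icon: `src/images/icon.png`,
}

// Assumption for illustration: `icon` is a plugin option rather than a manifest member,
// so it is left out here; the generated icon entries are not reproduced.
function sketchManifest({ icon, ...members }) {
  return JSON.stringify(members, null, 2)
}

console.log(sketchManifest(manifestOptions))
```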

Add offline support

Another requirement for a website to qualify as a PWA is the use of a service worker. A service worker runs in the background, deciding to serve network or cached content based on connectivity, allowing for a seamless, managed offline experience.

Gatsby’s offline plugin makes a Gatsby site work offline and more resistant to bad network conditions by creating a service worker for your site.
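
The "network or cached content" decision can be pictured with a small synchronous sketch. This is not the offline plugin's caching strategy, and a real service worker works asynchronously with fetch events and the Cache API. The respond function, the Map-based cache, and the fake fetchFromNetwork are all hypothetical.

```javascript
// Hypothetical decision function: prefer the network when online, fall back to the cache otherwise.
function respond(url, { online, cache, fetchFromNetwork }) {
  if (online) {
    const fresh = fetchFromNetwork(url)
    cache.set(url, fresh) // remember the latest copy for later offline use
    return fresh
  }
  return cache.has(url) ? cache.get(url) : "offline fallback"
}

const cache = new Map()
const fetchFromNetwork = url => `content of ${url}`

const onlineResponse = respond("/pandas-and-bananas/", { online: true, cache, fetchFromNetwork })
const offlineResponse = respond("/pandas-and-bananas/", { online: false, cache, fetchFromNetwork })
const neverVisited = respond("/about/", { online: false, cache, fetchFromNetwork })
console.log(onlineResponse, offlineResponse, neverVisited)
```

A page visited while online is still available offline, while a page that was never cached falls back to a placeholder.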

✋ Using gatsby-plugin-offline

  1. Install the plugin:

npm install gatsby-plugin-offline

  2. Add the plugin to the plugins array in your gatsby-config.js file.

gatsby-config.js

module.exports = {
  plugins: [
    {
      resolve: `gatsby-plugin-manifest`,
      options: {
        name: `GatsbyJS`,
        short_name: `GatsbyJS`,
        start_url: `/`,
        background_color: `#6b37bf`,
        theme_color: `#6b37bf`,
        // Enables "Add to Homescreen" prompt and disables browser UI (including back button)
        // see https://developers.google.com/web/fundamentals/web-app-manifest/#display
        display: `standalone`,
        icon: `src/images/icon.png`, // This path is relative to the root of the site.
      },
    },
    `gatsby-plugin-offline`,
  ],
}

That’s all you need to get started with service workers with Gatsby.

💡 The offline plugin should be listed after the manifest plugin so that the offline plugin can cache the created manifest.webmanifest.

Add page metadata

Adding metadata to pages (such as a title or description) is key in helping search engines like Google understand your content and decide when to surface it in search results.

React Helmet is a package that provides a React component interface for you to manage your document head.

Gatsby’s react helmet plugin provides drop-in support for server rendering data added with React Helmet. Using the plugin, attributes you add to React Helmet will be added to the static HTML pages that Gatsby builds.

✋ Using React Helmet and gatsby-plugin-react-helmet

  1. Install both packages:

npm install gatsby-plugin-react-helmet react-helmet

  2. Make sure you have a description and an author configured inside your siteMetadata object. Also, add the gatsby-plugin-react-helmet plugin to the plugins array in your gatsby-config.js file.

gatsby-config.js

module.exports = {
  siteMetadata: {
    title: `Pandas Eating Lots`,
    description: `A simple description about pandas eating lots...`,
    author: `gatsbyjs`,
  },
  plugins: [
    {
      resolve: `gatsby-plugin-manifest`,
      options: {
        name: `GatsbyJS`,
        short_name: `GatsbyJS`,
        start_url: `/`,
        background_color: `#6b37bf`,
        theme_color: `#6b37bf`,
        // Enables "Add to Homescreen" prompt and disables browser UI (including back button)
        // see https://developers.google.com/web/fundamentals/web-app-manifest/#display
        display: `standalone`,
        icon: `src/images/icon.png`, // This path is relative to the root of the site.
      },
    },
    `gatsby-plugin-offline`,
    `gatsby-plugin-react-helmet`,
  ],
}

  3. In the src/components directory, create a file called seo.js and add the following:

src/components/seo.js

import React from "react"
import PropTypes from "prop-types"
import { Helmet } from "react-helmet"
import { useStaticQuery, graphql } from "gatsby"

function SEO({ description, lang, meta, title }) {
  const { site } = useStaticQuery(
    graphql`
      query {
        site {
          siteMetadata {
            title
            description
            author
          }
        }
      }
    `
  )

  const metaDescription = description || site.siteMetadata.description

  return (
    <Helmet
      htmlAttributes={{
        lang,
      }}
      title={title}
      titleTemplate={`%s | ${site.siteMetadata.title}`}
      meta={[
        {
          name: `description`,
          content: metaDescription,
        },
        {
          property: `og:title`,
          content: title,
        },
        {
          property: `og:description`,
          content: metaDescription,
        },
        {
          property: `og:type`,
          content: `website`,
        },
        {
          name: `twitter:card`,
          content: `summary`,
        },
        {
          name: `twitter:creator`,
          content: site.siteMetadata.author,
        },
        {
          name: `twitter:title`,
          content: title,
        },
        {
          name: `twitter:description`,
          content: metaDescription,
        },
      ].concat(meta)}
    />
  )
}

SEO.defaultProps = {
  lang: `en`,
  meta: [],
  description: ``,
}

SEO.propTypes = {
  description: PropTypes.string,
  lang: PropTypes.string,
  meta: PropTypes.arrayOf(PropTypes.object),
  title: PropTypes.string.isRequired,
}

export default SEO

The above code sets up defaults for your most common metadata tags and provides you with an <SEO> component to use in the rest of your project. Pretty cool, right?

  4. Now, you can use the <SEO> component in your templates and pages and pass props to it. For example, add it to your blog-post.js template like so:

src/templates/blog-post.js

import React from "react"
import { graphql } from "gatsby"
import Layout from "../components/layout"
import SEO from "../components/seo"

export default function BlogPost({ data }) {
  const post = data.markdownRemark
  return (
    <Layout>
      <SEO title={post.frontmatter.title} description={post.excerpt} />
      <div>
        <h1>{post.frontmatter.title}</h1>
        <div dangerouslySetInnerHTML={{ __html: post.html }} />
      </div>
    </Layout>
  )
}

export const query = graphql`
  query($slug: String!) {
    markdownRemark(fields: { slug: { eq: $slug } }) {
      html
      frontmatter {
        title
      }
      excerpt
    }
  }
`

The above example is based off the Gatsby Starter Blog. By passing props to the <SEO> component, you can dynamically change the metadata for a post. In this case, the blog post title and excerpt (if it exists in the blog post markdown file) will be used instead of the default siteMetadata properties in your gatsby-config.js file.
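
The fallback behavior described here boils down to two small rules in the <SEO> component: use the passed description if there is one, and put the page title into the "%s | site title" template. The sketch below isolates those rules in plain JavaScript. The resolveHead helper is hypothetical, and the way React Helmet applies titleTemplate is simplified to a single substitution.

```javascript
// Mirrors two details of the <SEO> component: the description fallback and the "%s | site title" template.
function resolveHead({ title, description }, siteMetadata) {
  return {
    title: `%s | ${siteMetadata.title}`.replace("%s", () => title),
    description: description || siteMetadata.description,
  }
}

// siteMetadata values taken from the gatsby-config.js example above.
const siteMetadata = {
  title: `Pandas Eating Lots`,
  description: `A simple description about pandas eating lots...`,
}

const withExcerpt = resolveHead({ title: "Pandas and Bananas", description: "Do Pandas eat bananas?" }, siteMetadata)
const withoutExcerpt = resolveHead({ title: "Pandas and Bananas", description: "" }, siteMetadata)
console.log(withExcerpt, withoutExcerpt)
```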

Now, if you run the Lighthouse audit again as laid out above, you should get close to, if not reach, a perfect score of 100!

💡 For further reading and examples, check out Adding an SEO Component and the React Helmet docs!

Keep making it better

In this section, we’ve shown you a few Gatsby-specific tools to improve your site’s performance and prepare to go live.

Lighthouse is a great tool for site improvements and learning. Continue looking through the detailed feedback it provides and keep making your site better!

Next Steps

Official Documentation

Official Plugins

  • Official Plugins: The complete list of all the Official Plugins maintained by Gatsby.

Official Starters

  1. Gatsby’s Default Starter: Kick off your project with this default boilerplate. This barebones starter ships with the main Gatsby configuration files you might need. working example
  2. Gatsby’s Blog Starter: Gatsby starter for creating an awesome and blazing-fast blog. working example
  3. Gatsby’s Hello-World Starter: Gatsby Starter with the bare essentials needed for a Gatsby site. working example

That’s all, folks

Well, not quite; just for this tutorial. There are additional How-To Guides to check out for more guided use cases.

This is just the beginning. Keep going!

Check out the “how to contribute” docs for even more ideas.

We can’t wait to see what you do 😄.

Gatsby Tutorial: https://www.gatsbyjs.com/docs/tutorial/part-eight/

How to Make React SEO-Friendly: an Extensive SEO Guide

React-driven single-page applications (SPAs) are gaining momentum. Used by tech giants like Facebook, Twitter, and Google, React allows for building fast, responsive, and animation-rich websites and web applications with a smooth user experience.

However, products created with React (or any other modern JavaScript framework like Angular, Vue, or Svelte) have limited capabilities for search engine optimization. This becomes a problem if a website mostly acquires customers through website content and search marketing.

Luckily, there are a couple of ready-made solutions for React that can help you achieve visibility in search engines. We talk about them below.

But before we dive deep into technical details of optimizing single-page applications (SPA), let’s deal with some useful theory to help you understand the challenges with SEO in React websites.

What’s a single-page app and why React?

A single-page application is a web application whose content is served in a single HTML page. This page is dynamically updated, but it doesn’t fully reload each time a user interacts with it. Instead of sending a new request to the server for each interaction and then receiving a whole new page with new content, the client side of a single-page web app only requests and loads data that needs to be updated.

Facebook, Twitter, Netflix, Trello, Gmail, and Slack are all examples of single-page apps.

From the developer’s perspective, SPAs keep the presentation layer (the front end) and the data layer (the back end) separate, so two teams can develop these parts in parallel. Also, with an SPA, it’s easier to scale and use a microservices architecture than it is with a multi-page app.

From the user’s perspective, single-page apps offer seamless interaction with a web app. Plus, these apps can cache data in local storage, meaning they’ll load even faster the second time.

When it comes to building an SPA, developers can use one of the popular JavaScript frameworks: React, Angular, or Vue. Among these big three, React holds the top spot. React is the most popular JavaScript framework according to the 2019 State of JavaScript Survey.

Thanks to React’s component-based architecture, it’s easy to reuse code and divide a large app into smaller parts. Hence, maintaining and debugging large SPA projects is much easier than doing the same for large multi-page apps. The virtual DOM ensures high app performance. Also, the React library supports all modern browsers, including their older versions.

But as you might have guessed, React has disadvantages as well, and one of the biggest is the lack of SEO-friendliness.

Principles of search engine optimization

Search engine optimization (SEO) is the practice of increasing the quality and quantity of web traffic to a website by increasing its visibility on search engines.

The main task of SEO is to help a web app rank in search engines (Google, Bing) when the target audience searches for information by a certain keyword. Most SEO efforts focus on optimizing apps for Google. After all, as of 2020 Google has a 92.16% search engine market share according to StatCounter.

According to Backlinko, 75% of all user clicks go to the top three websites in search engine results, while websites from the second results page and beyond get only 0.78% of clicks. That’s why digital businesses are furiously fighting to get on this first page of search results.

For business owners, it’s crucial to think about an SEO strategy from the very beginning of web app development in order to choose an effective technology stack.

To determine a website’s ranking in search results, search engines use web crawlers. A web crawler is a bot that regularly visits web pages and analyzes them according to specific criteria set by the search engine. Each search engine has its own crawler, and Google’s crawler is called the Googlebot.

This is how crawlers work

The Googlebot explores pages link by link, gathers information on website freshness, content uniqueness, number of backlinks, etc., downloads HTML and CSS files, and sends all of this data to Google servers. Then it’s analyzed and indexed by a system called Caffeine. This is a fully automatic process, so it’s vital to ensure that crawlers correctly understand the website content. And here’s where the problem appears.

What’s wrong with optimizing single-page applications (SPA) for search engines?

Typically, single-page apps load pages on the client side: at first, the page is an empty container; then JavaScript pushes content to this container. The simplest HTML document for a React app might look as follows:

<html lang="en">
  <head>
    <title>React App</title>
  </head>
  <body>
    <noscript>You need to enable JavaScript to run this app.</noscript>
    <div id="root"></div>
    <script src="/static/js/bundle.js"></script>
  </body>
</html>

As you can see, there’s nothing except the <div> tag and an external script. Single-page applications need a browser to run a script, and only after the script is run will the content be dynamically loaded to the web page. So when a crawler visits the website, it sees an empty page without content. Hence, the page cannot be indexed.
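
To see the problem from the crawler's side, consider a naive text extractor that reads the HTML shell without executing any script. This is only a sketch: rootText is a hypothetical helper based on a regular expression, not how any real crawler parses pages, and it would not cope with nested div elements.

```javascript
// The HTML shell from the example above, as a crawler would download it.
const html = `<html lang="en">
<head>
<title>React App</title>
</head>
<body>
<noscript>You need to enable JavaScript to run this app.</noscript>
<div id="root"></div>
<script src="/static/js/bundle.js"></script>
</body>
</html>`

// Hypothetical, naive "crawler": reads the text inside <div id="root"> without running any script.
function rootText(markup) {
  const match = markup.match(/<div id="root">([\s\S]*?)<\/div>/)
  return match ? match[1].replace(/<[^>]+>/g, "").trim() : ""
}

console.log(JSON.stringify(rootText(html)))
console.log(rootText(`<div id="root"><h1>Hello</h1></div>`))
```

The shell yields an empty string, while markup that already contains rendered content yields readable text. That difference is what the solutions below aim for.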

In autumn 2015, Google announced that their bots would also inspect the JavaScript and CSS of web pages, so they would be able to render and understand web pages like browsers do. That was great news, but there are still problems with SEO optimization:

  • Long delays

If the content on a page updates frequently, crawlers should regularly revisit the page. This can cause problems, since reindexing may happen only a week after the content is updated, as Google Chrome developer Paul Kinlan reports on Twitter.

This happens because the Google Web Rendering Service (WRS) enters the game. After a bot has downloaded HTML, CSS, and JavaScript files, the WRS runs the JavaScript code, fetches data from APIs, and only after that sends the data to Google’s servers.

  • Limited crawling budget

The crawl budget is the maximum number of pages on your website that a crawler can process in a certain period of time. Once that time is up, the bot leaves the site no matter how many pages it’s downloaded (whether that’s 26, 3, or 0). If each page takes too long to load because of running scripts, the bot will simply leave your website before indexing it.
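
A toy model makes the effect of slow pages on the crawl budget easy to see. The numbers below are made up, and pagesCrawled is a hypothetical function that treats the budget as a fixed amount of time; real crawlers schedule their work in more complex ways.

```javascript
// Hypothetical model: the crawler processes pages in order until its time budget runs out.
function pagesCrawled(pageTimesMs, budgetMs) {
  let elapsed = 0
  let count = 0
  for (const time of pageTimesMs) {
    if (elapsed + time > budgetMs) break
    elapsed += time
    count += 1
  }
  return count
}

// Made-up numbers: ten pages, a 5-second budget.
const fastPages = Array(10).fill(300) // 300 ms each (e.g. pre-rendered HTML)
const slowPages = Array(10).fill(2000) // 2 s each (e.g. waiting on scripts)
console.log(pagesCrawled(fastPages, 5000), pagesCrawled(slowPages, 5000))
```

With the same budget, all ten fast pages are processed but only two of the slow ones.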

As for other search engines, Yahoo’s and Bing’s crawlers will still see an empty page instead of dynamically loaded content, so getting your React-based SPA to rank at the top of these search engines is an elusive goal.

You should think about how to solve this problem at the stage of designing the app architecture.

Solving the problem

There are a couple of ways to make a React app SEO-friendly: by creating an isomorphic React app or by using prerendering.

Isomorphic React apps

In plain English, an isomorphic JavaScript application (or in our case, an isomorphic React application) can run on both the client side and the server side.

Thanks to isomorphic JavaScript, you can run the React app and capture the rendered HTML file that’s normally rendered by the browser. This HTML file can then be served to everyone who requests the site (including Googlebot).

On the client side, the app can use this HTML as a base and continue operating on it in the browser as if it had been rendered by the browser. When needed, additional data is added using JavaScript, as an isomorphic app is still dynamic.

An isomorphic app detects whether the client is able to run scripts or not. When JavaScript is turned off, the code is rendered on the server, so a browser or bot gets all meta tags and content in HTML and CSS.

When JavaScript is on, only the first page is rendered on the server, so the browser gets HTML, CSS, and JavaScript files. Then JavaScript starts running and the rest of the content is loaded dynamically. Thanks to this, the first screen is displayed faster, the app is compatible with older browsers, and user interactions are smoother in contrast to when websites are rendered on the client side.

Building an isomorphic app can be really time-consuming. Luckily, there are frameworks that facilitate this process. The two most popular solutions for SEO are Next.js and Gatsby.

  • Next.js is a framework that helps you create React apps that are generated on the server side quickly and without hassle. It also allows for automatic code splitting and hot code reloading. Next.js can do full-fledged server-side rendering, meaning HTML is generated for each request right when the request is made.

  • Gatsby is a free open-source compiler that allows developers to make fast and powerful websites. Gatsby doesn’t offer full-fledged server-side rendering. Instead, it generates a static website beforehand and stores generated HTML files in the cloud or on the hosting service.

Let’s take a closer look at their approaches.

Gatsby vs Next.js

Gatsby solves the SEO challenge by generating static websites. All HTML pages are generated in advance, during the development or build phase, and are then simply loaded into the browser. They contain static data that can be hosted on any hosting service or in the cloud. Such websites are very fast, since they aren’t generated at runtime and don’t wait for data from the database or API.

But data is only fetched during the build phase. So if your web app has any new content, it won’t be shown until another build is run.

This approach works for apps that don’t update data too frequently, e.g. blogs. But if you want to build a web app that loads hundreds of comments and posts (like forums or social networks), it’s better to opt for another technique.

The second approach is server-side rendering (SSR), which is offered by Next.js. In contrast with traditional client-side rendering, in server-side rendering, HTML is generated on the server, and then the server sends already generated HTML and CSS files to the browser. Also, with Next.js, HTML is generated each time a client sends a request to the server. This is vital when a web app contains dynamic data (as forums or social networks do). For SSR to work with React, developers also need a Node.js server, which can process all requests at runtime.

Server-side rendering with Next.js

The Next.js rendering algorithm looks as follows:

  1. The Next.js server, running on Node.js, receives a request and matches it with a certain page (a React component) using a URL address.

  2. The page can request data from an API or database, and the server will wait for this data.

  3. The Next.js app generates HTML and CSS based on the received data and existing React components.

  4. The server sends a response with HTML, CSS, and JavaScript.

    How Next.js works
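
The four steps above can be sketched in a few lines. This is a language-agnostic illustration written in Python, not Next.js code: the `POSTS` data, the `load_post` loader, the `render_post` template, and the `handle_request` function are all hypothetical names chosen for the example.

```python
from html import escape

# Hypothetical in-memory "database"; a real app would call an API or a DB here.
POSTS = {"hello": {"title": "Hello", "body": "First post"}}


def load_post(slug):
    """Step 2: fetch the data the page needs (the server waits for it)."""
    return POSTS.get(slug)


def render_post(post):
    """Step 3: turn the data into HTML on the server."""
    return (
        "<html><head><title>{t}</title></head>"
        "<body><h1>{t}</h1><p>{b}</p>"
        '<script src="/app.js"></script></body></html>'
    ).format(t=escape(post["title"]), b=escape(post["body"]))


def handle_request(path):
    """Steps 1 and 4: match the URL to a page and return (status, html)."""
    prefix = "/posts/"
    if path.startswith(prefix):
        post = load_post(path[len(prefix):])
        if post is not None:
            return 200, render_post(post)
    return 404, "<html><body><h1>Not found</h1></body></html>"


status, html = handle_request("/posts/hello")
print(status, html)
```

The point of the sketch is that the HTML is built inside `handle_request`, that is, once per request, so it always reflects the current data.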

Making a website SEO-friendly with GatsbyJS

The process of optimizing React applications is divided into two phases: generating a static website during the build and processing requests during runtime.

The build time process looks as follows:

  1. Gatsby’s bundling tool receives data from an API, CMS, and file system.

  2. During deployment (for example, as a step in a CI/CD pipeline), the tool generates static HTML and CSS on the basis of data and React components.

  3. After compilation, the tool creates an about folder with an index.html file. The website consists of only static files, which can be hosted on any hosting service or in the cloud.

Request processing during runtime happens like this:

  1. Gatsby instantly sends the HTML, CSS, and JavaScript files for the requested page, since they were already rendered during compilation.

  2. After JavaScript is loaded to the browser, the website starts working like a typical React app. You can dynamically request data that isn’t important for SEO and work with the website just like you work with a regular single-page React app.

    How Gatsby works
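
The split between build time and runtime can be illustrated with a small sketch. It is written in Python and does not use any Gatsby API; `build_site`, `serve`, and the `pages` data are hypothetical names, and the rendering is reduced to a string template.

```python
import tempfile
from html import escape
from pathlib import Path


def build_site(pages, out_dir):
    """Build time: render every page once and write <slug>/index.html."""
    out_dir = Path(out_dir)
    written = []
    for slug, data in pages.items():
        html = "<html><body><h1>{}</h1><p>{}</p></body></html>".format(
            escape(data["title"]), escape(data["body"])
        )
        target = out_dir / slug / "index.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(html, encoding="utf-8")
        written.append(target)
    return written


def serve(out_dir, path):
    """Runtime: no rendering, just return the file generated at build time."""
    target = Path(out_dir) / path.strip("/") / "index.html"
    return target.read_text(encoding="utf-8") if target.is_file() else None


# Hypothetical content that a real build would pull from an API, CMS, or files.
pages = {"about": {"title": "About", "body": "Static page"}}
with tempfile.TemporaryDirectory() as tmp:
    build_site(pages, tmp)
    about_html = serve(tmp, "/about/")
    missing = serve(tmp, "/contact/")
print(about_html)
```

Because `serve` only reads files, any content added after `build_site` ran stays invisible until the next build, which is the trade-off described above.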

Creating an isomorphic app is considered the most reliable way to make React SEO-compatible, but it’s not the only option.

Prerendering

The idea of prerendering is to preload all HTML elements on the page and cache all SPA pages on the server with the help of Headless Chrome. One popular way to do this is using a prerendering service like prerender.io. A prerendering service intercepts all requests to your website and uses the User-Agent header to determine whether a bot or a user is viewing the website.

If the viewer is a bot, it gets the cached HTML version of the page. If it’s a user, the single-page app loads normally.

This is how prerendering works
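
The bot-or-user decision can be sketched as follows. This is a minimal illustration, not how prerender.io is implemented: the `BOT_MARKERS` list is deliberately short and hypothetical, and `is_bot` and `choose_response` are names invented for the example.

```python
# Hypothetical, deliberately short list; real services keep a much longer one.
BOT_MARKERS = ("googlebot", "bingbot", "yandex", "duckduckbot", "baiduspider")


def is_bot(user_agent):
    """Return True if the User-Agent string looks like a known crawler."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BOT_MARKERS)


def choose_response(user_agent, path, cache):
    """Bots get cached HTML if available; everyone else gets the SPA shell."""
    if is_bot(user_agent) and path in cache:
        return "prerendered", cache[path]
    return "spa", '<div id="root"></div><script src="/bundle.js"></script>'


cache = {"/": "<h1>Home</h1><p>Rendered ahead of time</p>"}
bot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
user_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36"
print(choose_response(bot_ua, "/", cache)[0])   # prerendered
print(choose_response(user_ua, "/", cache)[0])  # spa
```

A bot that requests a page missing from the cache falls through to the SPA shell here; a real service would more likely render the page on demand and cache the result.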

Prerendering has a lighter server payload compared to server-side rendering. But on the other hand, most prerendering services are paid and work poorly with dynamically changing content.

Using Gatsby: A demo project

To demonstrate how isomorphic React apps work, we’ve created a simple app for the Yalantis blog using Gatsby. We’ve uploaded the source code of the app to this repository on GitHub.

Gatsby renders only the starting page. Then the website works as a single-page app. To see how this isomorphic app works, just turn off JavaScript in your browser’s devtools. If you use Chrome, here are instructions on how to do that. If you refresh the page, the website should work just like it works with JavaScript turned on.

Our app with enabled and disabled JavaScript

Now let’s check the audit result using Lighthouse. Its performance is close to 100, while SEO for Gatsby.js applications is measured as high as 90.

Lighthouse results

If this website were on the production server, all indicators would be approximately 100.

The bottom line

Single-page React applications offer exceptional performance, seamless interactions close to those of native applications, a lighter server payload, and ease of web development.

Challenges with SEO shouldn’t be a reason for you to avoid using the React library. Instead, you can use the above-mentioned solutions to fight this issue. Moreover, search engine crawlers are getting smarter every year, so in the future, SEO optimization may no longer be a pitfall of using React.

https://yalantis.com/blog/search-engine-optimization-for-react-apps/

How To Use the JavaScript Fetch API to Get Data

Introduction

There was a time when XMLHttpRequest was used to make API requests. It didn’t include promises, and it didn’t make for clean JavaScript code. With jQuery, you could use the cleaner jQuery.ajax() syntax.

Now, JavaScript has its own built-in way to make API requests. This is the Fetch API, a new standard for making server requests with promises, and it includes many other features as well.

In this tutorial, you will create both GET and POST requests using the Fetch API.


How To Use Axios with React

Introduction

Many projects on the web need to interface with a REST API at some stage in their development. Axios is a lightweight HTTP client based on the $http service within Angular.js v1.x and is similar to the native JavaScript Fetch API.

Axios is promise-based, which gives you the ability to take advantage of JavaScript’s async and await for more readable asynchronous code.

You can also intercept and cancel requests, and there’s built-in client-side protection against cross-site request forgery.

In this article, you will see examples of how to use Axios to access the popular JSON Placeholder API within a React application.


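Liu Zhiyuan of Tsinghua University: What Makes Natural Language Understanding Hard?

Anyone who follows natural language processing (NLP) on Weibo and Zhihu is probably familiar with the two hashtags #NLP太难了# ("NLP is too hard") and #自然语言理解太难了# ("natural language understanding is too hard"). They collect all kinds of puzzling sentences and ambiguity-driven jokes that stump not only computers but people as well. These examples, however, only give an intuitive sense that understanding human language is hard for computers; an accessible explanation of where exactly the difficulty of NLP lies is still missing.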


Using Python to Collect Data from Any Website

There are moments while working when you realize that you may need a large amount of data in a short amount of time. These could be instances when your boss or customer wants a specific set of information from a specific website. Maybe they want you to collect over a thousand pieces of information or data from said website. So what do you do?

One option could be to check out this website and manually type in every single piece of information requested. Or better yet, you could make Python do all the heavy lifting for you!

Utilizing one of Python’s most useful libraries, BeautifulSoup, we can collect most data displayed on any website by writing some relatively simple code. This action is called Web Scraping. In the next few parts, we will be learning and explaining the basics of BeautifulSoup and how it can be used to collect data from almost any website.

The Challenge

In order to learn how to use BeautifulSoup, we must first have a reason to use it. Let’s say that hypothetically, you have a customer that is looking for quotes from famous people. They want to have a new quote every week for the next year. They’ve tasked us with the job to present them with at least fifty-two quotes and their respective authors.

Website to Scrape

We can probably just go to any website to find these quotes but we will be using this website for the list of quotes. Now our customer wants these quotes formatted into a simple spreadsheet. So now we have the choice of either typing out fifty-two quotes and their respective authors in a spreadsheet or we can use Python and BeautifulSoup to do all of that for us. So for the sake of time and simplicity, we would rather go with Python and BeautifulSoup.

Starting BeautifulSoup

Let’s begin by opening any IDE that you prefer; we will be using Jupyter Notebook. (The GitHub code for all of this will be available at the end of the article).

Importing Python Libraries

We will start by importing the libraries needed for BeautifulSoup:

from bs4 import BeautifulSoup as bs
import pandas as pd
pd.set_option('display.max_colwidth', 500)
import time
import requests
import random

Accessing the Website

Next, we’ll have to actually access the website for BeautifulSoup to parse by running the following code:

page = requests.get("http://quotes.toscrape.com/")
page
# <Response [200]>

This returns a response status code letting us know if the request has been successfully completed. Here we are looking for Response [200] meaning that we have successfully reached the website.

Parsing the Website

Here we will be parsing the website using BeautifulSoup.

# Naming the parser explicitly avoids bs4's "guessed at parser" warning
soup = bs(page.content, "html.parser")
soup

Running this code will return what looks like a printed text document in HTML code that looks like this:

image-20211015185548456

We can navigate through the parsed document above using BeautifulSoup.

Navigating the Soup

Now we will need to find the exact thing we are looking for in the parsed HTML document. Let’s start by finding the quotes.

An easy way to find what we are looking for is by:

  • Go to the webpage and find the desired piece of information (in our case, the quotes).
  • Highlight that piece of information (the quote).
  • Right-click it and select Inspect.

image-20211015185713695

This will bring up a new window that looks like this:

image-20211015185759078

The highlighted section is where we will find the quote we are looking for. Just click the arrow on the left of the highlighted section to see the quote in the code.

HTML Information for Navigation

Based on the highlighted HTML code, we can navigate the soup. We will use the .find_all() method in our own code to find the quotes we are looking for. This method returns every element that matches the arguments we give it. Since we can see that the HTML code for the quote contains class="text", we can use that in our BeautifulSoup code:

soup.find_all(class_='text')

Running this code will return the following results:

image-20211015185836032

From this we can see that we are able to successfully locate and retrieve the code and text containing the quotes needed.

To retrieve only the text and exclude the surrounding HTML, we use the .text attribute of each result. To do so, we iterate through the list with a list comprehension (a compact “for” loop):

quotes = [i.text for i in soup.find_all(class_='text')]
quotes

This will give us the list of quotes without their respective HTML code:

image-20211015185905217

Now we know how we can access the quotes within the website and retrieve them for our purposes. Repeat the steps mentioned before to retrieve the author names for each quote as well:

authors = [i.text for i in soup.find_all(class_='author')]
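
To try both lookups without touching the network, the same calls can be run against an inline HTML string. The sample markup below is hypothetical; it only mimics the `text` and `author` class names seen in the inspector.

```python
from bs4 import BeautifulSoup

# Hypothetical sample that mimics the structure of one quote block on the site.
html = """
<div class="quote">
  <span class="text">"A sample quote."</span>
  <small class="author">Jane Doe</small>
</div>
<div class="quote">
  <span class="text">"Another sample quote."</span>
  <small class="author">John Roe</small>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all(class_=...) returns matching tags in document order,
# so the two lists line up index by index.
quotes = [tag.text for tag in soup.find_all(class_="text")]
authors = [tag.text for tag in soup.find_all(class_="author")]
print(list(zip(authors, quotes)))
```

Because both lists come back in document order, the first author belongs to the first quote, which is what later lets them be paired in a table.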

Accessing Multiple Pages

Now that we know how to retrieve data from a specific webpage, we can move on to the data from the next set of pages. As we can see from the website, not all the quotes are stored on a single page. We must be able to navigate to different pages in the website in order to get more quotes.

Notice that the url for each new page contains a changing value:

Knowing this we can create a simple list of URLs to iterate through in order to access different pages in the website:

urls = [f"http://quotes.toscrape.com/page/{i}/" for i in range(1, 11)]
urls

This returns a list of URLs we can use:

image-20211015185942865

From this list, we can create another “for” loop to collect the necessary number of quotes and their respective authors.

Avoiding Web Scraping Detection

One important thing to note: some websites do not approve of web scraping. These sites will implement ways to detect if you are using a web scraping tool such as Beautiful Soup. For example, a website can detect if a large number of requests were made in a short amount of time, which is what we are doing here. In order to potentially avoid detection, we can randomize our request rate to more closely mimic human interaction. Here is how we do that:

Generate a list of values:

rate = [i/10 for i in range(10)]

Then at the end of every loop, input the following piece of code:

time.sleep(random.choice(rate))

Here we are randomly choosing a value from the list we created and waiting that selected amount of time before the loop begins again. This will slow down our code but will help us avoid detection.
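
The two snippets above can be wrapped in a small helper. This is a sketch under assumptions: `polite_pause` is a hypothetical name, and the `sleep` parameter exists only so the function can be exercised without actually waiting.

```python
import random
import time


def polite_pause(rate, sleep=time.sleep):
    """Sleep for a randomly chosen delay and return the delay used."""
    delay = random.choice(rate)
    sleep(delay)
    return delay


rate = [i / 10 for i in range(10)]  # 0.0, 0.1, ..., 0.9 seconds
for page_number in range(1, 4):
    waited = polite_pause(rate)
    print(f"page {page_number}: waited {waited:.1f}s")
```

Note that `rate` includes 0.0, so some iterations do not wait at all. If a guaranteed minimum gap is wanted, something like `random.uniform(0.5, 1.5)` is one alternative.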

Bringing it all Together

Now that we have all the pieces, we can construct the final “for” loop that will gather at least 52 quotes and their respective authors:

The Entire Code to Retrieve at least 52 Quotes and their Authors
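
The embedded code for this step did not survive in this copy of the article. The block below is a minimal sketch assembled from the pieces shown so far, not the author's original code. The function name `scrape_quotes`, the injectable `fetch` and `sleep` parameters, and the `main` wrapper are assumptions added so the logic can be tried without hitting the site.

```python
import random
import time

import requests
from bs4 import BeautifulSoup


def scrape_quotes(urls, rate, fetch=None, sleep=time.sleep):
    """Collect quotes and authors from each URL, pausing between requests."""
    if fetch is None:
        # Default fetcher: download the page and fail loudly on HTTP errors.
        def fetch(url):
            response = requests.get(url, timeout=10)
            response.raise_for_status()
            return response.content

    quotes, authors = [], []
    for url in urls:
        soup = BeautifulSoup(fetch(url), "html.parser")
        quotes.extend(tag.text for tag in soup.find_all(class_="text"))
        authors.extend(tag.text for tag in soup.find_all(class_="author"))
        sleep(random.choice(rate))  # randomized delay between requests
    return quotes, authors


def main():
    """Scrape all ten pages; this performs real HTTP requests."""
    urls = [f"http://quotes.toscrape.com/page/{i}/" for i in range(1, 11)]
    rate = [i / 10 for i in range(10)]
    quotes, authors = scrape_quotes(urls, rate)
    print(len(quotes), len(authors))
    return quotes, authors
```

Calling `quotes, authors = main()` runs the full scrape. Passing a custom `fetch` function makes it possible to feed the loop saved HTML instead of live pages.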

Once we run the code above we will end up with a list of quotes and a list of authors. However, our customer wants the quotes in a spreadsheet. To accommodate that request, we will have to use the Python library: Pandas.

Inputting the lists into a Pandas DataFrame is very simple:

# Creating a DataFrame to store our newly scraped information
df = pd.DataFrame()
# Storing the quotes and authors in their respective columns
df['Authors'] = authors
df['Quotes'] = quotes

Since the quotes and the authors were scraped in order, they will be easy to input into the DataFrame.

Once we have run the code above, the final DataFrame will look like this:

image-20211015190016456

Excellent! The DataFrame looks great and is in the exact format requested by the customer. We can then save it as an Excel spreadsheet file, which we can send to our customer.
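
The article does not show the saving step, so here is one possible sketch. The two lists are hypothetical stand-ins for the scraped data, and CSV is used because it opens in any spreadsheet program without extra dependencies.

```python
import pandas as pd

# Hypothetical stand-ins for the scraped lists.
authors = ["Jane Doe", "John Roe"]
quotes = ['"A sample quote."', '"Another sample quote."']

# Build the DataFrame in one step; both lists must have the same length.
df = pd.DataFrame({"Authors": authors, "Quotes": quotes})

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)

# For a native .xlsx file, df.to_excel("quotes.xlsx", index=False) also works,
# provided an Excel engine such as openpyxl is installed.
```

Passing a filename, as in `df.to_csv("quotes.csv", index=False)`, writes the file to disk. The `index=False` argument keeps the row numbers out of the spreadsheet.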

Closing

We hope you learned a little bit about web scraping from this step-by-step tutorial. Even though the example we used may be quite simple, the techniques used will still be applicable to many different websites all over the internet. More complex looking websites that require extensive user interaction will require another Python library called Selenium. However, most websites will only need BeautifulSoup to scrape data. The walkthrough we have done here should be enough to get you started. Happy scraping!