1. Reproducibility

  • Reproducibility: trying to take the random out of random;

  • Computers are fundamentally deterministic, i.e. every step is predictable,
    so the randomness they produce can simply be thought of as simulated randomness,
    or pseudorandomness; see the sketch below;
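
  • A minimal sketch of that determinism (only torch is assumed, and the variable
    names are illustrative): PyTorch's random number generator is just hidden state,
    so saving it with torch.get_rng_state() and restoring it with
    torch.set_rng_state() replays exactly the same "random" numbers;

import torch

# save the current state of the global random number generator
rng_state = torch.get_rng_state()
first_draw = torch.rand(3)

# restore the saved state and draw again -> identical "random" numbers
torch.set_rng_state(rng_state)
second_draw = torch.rand(3)

print(first_draw)
print(second_draw)
print(torch.equal(first_draw, second_draw))  # True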

  • Neural networks and deep learning: a neural network starts from random numbers
    that describe the patterns in the data, then tries to improve those random
    numbers with tensor operations so that they describe the patterns better;
    a sketch of one such update step follows below;
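
  • A minimal, hypothetical sketch of that idea (the toy data y = 2 * x and the
    learning rate 0.1 are made up for illustration): a single weight starts as a
    random number, and one gradient step, i.e. a few tensor operations, nudges it
    toward the pattern in the data;

import torch

# hypothetical toy data: the pattern is y = 2 * x
x = torch.rand(10, 1)
y = 2 * x

# start from a random number that (badly) describes the pattern
weight = torch.rand(1, 1, requires_grad=True)

# tensor operations: guess, measure the error, get the gradient
pred = x @ weight                   # current guess
loss = torch.mean((pred - y) ** 2)  # how wrong the guess is
loss.backward()                     # gradient of the loss w.r.t. the weight

# nudge the random number toward the pattern (one improvement step)
with torch.no_grad():
    weight -= 0.1 * weight.grad

print(weight)  # still random, but a little closer to 2.0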

  • Randomness is nice and powerful, but sometimes you want as little randomness
    as possible, so that experiments can be repeated, i.e. reproducibility;
    below is a reproducibility example:

import torch

# create two random tensors
tensor_ra = torch.rand(3, 4)
tensor_rb = torch.rand(3, 4)

print(f"Tensor A:\n{tensor_ra}\n")
print(f"Tensor B:\n{tensor_rb}\n")
print("Tensor A equal Tensor B?")
tensor_ra == tensor_rb
  • As above, we create two random tensors and expect them to be different. But what
    if we wanted two random tensors with the same values, i.e. tensors that still
    contain random values but share the same flavour?

  • That is where torch.manual_seed(seed) comes in: seed is an integer, such as 42,
    that flavours the randomness;

import torch

# set the random seed; try changing the seed and watch what happens to the numbers
RANDOM_SEED = 44
torch.manual_seed(seed=RANDOM_SEED)
tensor_rc = torch.rand(3, 4)

# the seed has to be reset before every new rand() call, otherwise tensor_rd
# would differ from tensor_rc; try commenting this line out and see what happens
torch.random.manual_seed(seed=RANDOM_SEED)
tensor_rd = torch.rand(3, 4)

print(f"Tensor C:\n{tensor_rc}\n")
print(f"Tensor D:\n{tensor_rd}\n")
print("Tensor C equal Tensor D?")
tensor_rc == tensor_rd
Tensor C:
tensor([[0.7196, 0.7307, 0.8278, 0.1343],
        [0.6280, 0.7297, 0.2882, 0.2112],
        [0.9836, 0.8722, 0.9650, 0.7837]])

Tensor D:
tensor([[0.7196, 0.7307, 0.8278, 0.1343],
        [0.6280, 0.7297, 0.2882, 0.2112],
        [0.9836, 0.8722, 0.9650, 0.7837]])

Tensor C equal Tensor D?

tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
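
  • If resetting the global seed before every call feels clumsy, a dedicated
    torch.Generator can carry its own seeded state instead; a minimal sketch
    (the variable names and the seed 44 are illustrative):

import torch

# two generators seeded identically, independent of the global RNG
gen_a = torch.Generator().manual_seed(44)
gen_b = torch.Generator().manual_seed(44)

tensor_re = torch.rand(3, 4, generator=gen_a)
tensor_rf = torch.rand(3, 4, generator=gen_b)

print(torch.equal(tensor_re, tensor_rf))  # True: same seed, same values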