Search is dying, use ChatGPT plugins.
2023-03-25
Articles
469
Many people are saying that AI powered search will be the future. It is not. With the announcement of ChatGPT plugins, if you are still thinking that search will be important in the future, then you are still in the old mindset.
Search is only a bridge between the old fashion web interface, and the new AI powered language interface. Search gives AI access to the traditional web. But what will really empower AI is the plugin based language interface. One important component that will play a key role is the information retrieval plugin open sourced by OpenAI.
Imagine you are a website owner, you could sit still and let AI powered search to crawl and understand your website. Or you can provide your own plugin that makes use of the information retrieval plugin provided by OpenAI, to serve your well-indexed contents when ChatGPT decided to use your plugin. The former one is passive, and the later one is active. Search is non-deterministic,
Tech Dec 2022
2022-12-10
Tech
748
12月10日 土曜
前は日記に置いてるが、やはりテックのものを別記事にまとめます。
Stable Diffusion
Stable Diffusionを安定で高質の絵を出力させるには呪文が必要らしい。この文章を参考しました。最初はプロンプトだけを使ってるが、顔や体が歪んでしまいがち。ネガティブプロンプトがあることを思い出して、試しに入れてみたら歪みが大分治まった。
現時点で一番いいプロンプト。
--prompt "${normal_prompt_content} \
with fantastic lighting fantastic composition \
in high production quality" \
--negative-prompt "twisted face twisted body closeup view"
一部の出力。
銃を持つ学校制服を着てる少女
--prompt "a cute anime girl in school skirt uniform holding a gun \
with fantastic lighting fantastic composition \
in PlayStation5 octane render style" \
--negative-prompt "twisted face twisted body closeup view"
白い
AtCoder August 2022
2022-08-16
ACM
1157
ABC264 on 13rd
E. Blackout 2
Problem: https://atcoder.jp/contests/abc264/tasks/abc264_e
問題文
ある国には N 個の都市と M 個の発電所があります。これらを総称して地点と呼びます。
地点には 1,2,…,N+M の番号がつけられており、そのうち都市は地点 1,2,…,N で発電所は地点 N+1,N+2,…,N+M です。
この国には電線が E 本あり、電線 i ( 1≤i≤E ) は地点 Ui と地点 Vi を双方向に結びます。
また、ある都市に 電気が通っている とは、ある都市から電線をいくつか辿って少なくともひとつの発電所に辿り着くことができる状態を言います。
今、 Q 個のイベントが起こります。そのうち i ( 1≤i≤Q ) 番目のイベントでは電線 Xi が切れ、その電線を辿ることができなくなります。一
Review of Linear Algebra
2021-10-20
Math
336
Matrix
Basic Operations
Matrix Product (Dot Product)
Matrix Division
Matrix Multiplication (Hadamard Product)
Matrix Operations
Matrix Transpose
Matrix Inversion
Matrix Trace (the sum of the main diagonal elements)
Matrix Determinant
Matrix Rank (the maximum number of linearly independent columns)
Matrix Factorization (?)
LU Decomposition
QR Decomposition
Eigendecomposition
Singular Value Decomposition
Ref: Linear Algebra Cheat Sheet Python Code
Others:
Unitary Matrix
A matrix that satisfies
Minor Matrix
Adjugate Matrix
For square matrix A, , , where is the (i, j) minor of A.
Conjugate Transpose
, for complex number ,
Orthogonal Matrix
A matrix that satisfy
Matrix Product
Matrix product is produced by taking i-th row of the first matrix and j-th column of the second matrix and calculate dot product of the two vectors as the (i, j) element in the result matrix.
Given
then,
Matrix Division
Matrix Division makes use of the following property:
where
NLP with Deep Learning
2021-09-09
Machine Learning
332
Transformer
Attention is all you need - Arxiv
Feed Forward: Two fully connected layers (Linear) with a ReLU activation in between.
Multi-head Attention:
Attention
Attention - Qiita
Self-Attention
GPT (Generative Pre-Training)
GPT - paper
BERT (Bidirectional Encoder Representations from Transformers)
BERT - Arxiv
BERT Explained
Different from the Transformer (GPT) which is trained using only the left context, BERT uses bidirectional encoder which makes use of both the left and the right context. [MASK] is used to mask some of the words so that the model will not see the word itself indirectly.
Pre-training of BERT makes use of two strategies: MLM (Masked Language Model) and NSP (Next Sentence Prediction). The model is trained with both the strategies together.
As shown below, the input embeddings of BERT consists of the token embeddings, the segment embeddings, and the position embeddings. Note that a segment may consists of multiple sentences.
In MLM task,
HomeVR Dev Log
2020-11-09
Tech
589
Nov 04
WebVR requires HTTPS on Oculus Quest.
WebVR seems to require enabling Universal PR (Will cause some issues)
Universal PR supports ShaderGraph, which is good. But it does not support point light shadows, which is bad.
By default, Universal PR only enable shadows for the main light. Go to UniversalRenderPipelineAssset > Lighting > Additional Lights, enable Cast Shadows. You may also want to enable Shadows > Soft Shadows.
The PostProcessing Volume for Universal PR is Create > Volume > Global Volume. Some suggest creating a empty GameObject and add PostProcessingVolume to it. Do not work for me. (Maybe it is because the Volume Mask is set to Default)
Nov 09
Unity do not have Mirror support for build-in pipeline and URP. HDRP has planar reflection. To make planar reflection work in URP, you may use some tricks like reflection probe or simple reflection camera logics. But these only work for small planar reflection. For larger mirror, they are broken
Followers
2020-07-16
Articles
587
一开始我是被 PV 吸引了。有点意思,我想。女主有点酷;虽然 PV 完全意味不明,但是就是很戳。大概是因为色彩主题和几分前卫的内容。
第一集看下来,女主(似乎中文翻译应该是叫夏目?)果然很酷,而且穿衣也很潮。(虽然她那个画家朋友比女主还拽就是了。)而且女主手机,屏幕碎了都舍不得换,不得不说细节简直了。后来看到说是演员自己觉得这样的细节更符合女主的人设。天,秒粉了(不是)。而放在写这篇评论的现在,我的手机屏幕也碎了 n 久都还不想换,就更有共鸣了(笑)。
如果单看女主的线,其实剧情还是蛮套路的;但却不乏味。而这部剧的其实不止一个主角。摄影师奈良和她的朋友们也占了同等重要的戏份。这是一部以多名女性为主角,藉此反映社会现象和新文化的番剧。这部剧融合的文化元素可谓是前卫的大杂烩。男主是一名 Youtuber ;
Tic-Tac-Toe Online
2019-10-10
Tech
252
Tic-Tac-Toe Online Server
Base on the Tic-Tac-Toe Game of CS188, Berkeley, I develop an online version of Tic-Tac-Toe. Now your agent can play with my agent online!
I think it is a good way to check whether our agents are optimal or not.
My agent can beat random agents most of the time even if my agent is the second player.
Online Server Website: Tic-Tac-Toe Online
Download the attached client file from the moodle form and place it in the same directory of your solveTicTacToe.py (solveTicTacToe.py depends on util.py , so util.py is also needed).
Run this command:
$ python3 TicTacToeOLClient.py -u demo -n 3
And Enjoy!
Notice: You need to specify a username with "-u USERNAME". Don't use "demo" as your username cause it is forbidden.
Usage: TicTacToeOLClient.py [options]
Options
Variable
Description
-u
USERNAME
Username, must not be empty nor
Shenyang2019.icpc D Fish Eating Fruit
2019-09-18
ACM
419
The Question
Fish eating fruit on jisuanke
Given an undirected acyclic graph G, all possible path P in the graph, calculate:
The first taste
In the contest, a handsome foriegn teammate conviced me that this problem can be solve using LCA. I tried. And it did work, with the help of dijkstra.
My solution is to, first of all, run dijkstra, and get the distance between root node and every other nodes. Then, calculate the LCA for every two nodes. The desired result is:
It worked, but we got TLE for looping though all the nodes, which is .
The second trial
After the contest, I was told that this is a DP problem. You calculate the times an edge is accessed, times it with the weight, sum them up by the modulus of 3, you got the result.
This one, however, also got TLE. Oh, FISH!
The final solution
The reason why the second solution still can
R-CNN
2019-07-07
Computer Vision
2671
Overview
Regions with CNN features:
Efficient Graph Based Image Segmentation
use disjoint set to speed up merge operation
Selective Search
HOG (Histogram of Oriented Gradient)
Multiple criterions (color, texture, size, shape) to merge regions
AlexNet/VGG16
R-CNN
Notice that many descriptions are replicated from the orignal sources directly.
Some Fundermental Conceptions
Batch Size
Stochastic Gradient Descent. Batch Size = 1
Batch Gradient Descent. Batch Size = Size of Training Set
Mini-Batch Gradient Descent. 1 < Batch Size < Size of Training Set
Regularization
A regression model that uses L1 Regularization technique is called Lasso Regression and model which uses L2 is called Ridge Regression.
Ridge Regularization
Ridge regression adds "squared magnitude" of coefficient as penalty term to the loss function.
The first sum is an example of loss function.
Lasso Regularization
Lasso Regression (Least Absolute Shrinkage and Selection Operator) adds "absolute value of magnitude" of coefficient as penalty term to the loss function.
SIFT
2019-07-03
Computer Vision
389
Useful Materials
Distinctive Image Features from Scale-Invariant Keypoints[1] by David G. Lowe.
SIFT(Scale-Invariant Feature Transform)[2] on Towards Data Science.
The SIFT (Scale Invariant Feature Transform) Detector and Descriptor[3].
Notes
Uses DoG (Difference of Gaussian) to approximate Scale-normalized LoG (Laplacian of Gaussian)[4].
where is the two dimensions Gaussian function, and is the input image.
[need more consideration] After each octave, the Gaussian image is down-sampled by a factor of 2, by resampling the Gaussian image that has twice the initial value of by taking every second pixel in each row and column. And we start on the new octave with .
Since the image size is reduced to 1/4, the sigma for the next octave becomes , which is equal to .
To understand it, frist consider this question: If the image size is reduced to 1\4, but the kernel size of
Image Processing - Noise and denoise
2019-03-20
Computer Vision
466
Types of Noise
Additive noise
Additive noise is independent from image signal. The image g with nosie can be considered as the sum of ideal image f and noise n.[1]
Multiplicative noise
Multifplicative noise is often dependent on image signal. The relation of image and noise is[1]:
Gaussian noise
Gaussian noise, named after Carl Friedrich Gauss, is statistical noise having a probability density function (PDF) equal to that of the normal distribution, aka. the Gaussian distribution. i.e. the values that the noise can take on are Gaussian-distributed.
The PDF of a Gaussian random variable is given by[2]:
Salt-and-pepper noise
Fat-tail distributed or "impulsive" noise is sometimes called salt-and-pepper nosie or spike noise. An image containing salt-and-pepper noise will have dark pixels in bright regions and bright pixels in dark regions.[2]
The PDF of (Bipolar) Impulse noise is given by:
if b > a, gray
Greek Alphabet
2019-03-19
Stuff
81
Greek Alphabet
Letter (Capital Case)
Letter (Lower Case)
Name
English Equivalent
alpha
a
beta
b
gamma
g
delta
d
epsilon
e
zeta
z
eta
h
theta
th
iota
i
kappa
k
lambda
l
mu
m
nu
n
xi
x
omicron
o
pi
p
rho
r
sigma
s
tau
t
upsilon
u
phi
ph
chi
ch
psi
ps
omega
o
References
[1]. Greek Alphabet - Wikipedia
[2]. Greek alphabet letters & symbols
Decision Tree
2019-03-16
Machine-Learning
1037
Information Content
where,
I : the information content of X. An X with greater I value contains more information.
P : the probability mass function.
: b is the base of the logarithm used. Common values of b are 2 (bits), Euler's number e (nats), and 10 (bans).
Entropy (Information Theory)[1]
where,
H : the entropy H. Named after Boltzmann's H-theorem (but the definition is proposed by Shannon). H indicates the uncertainty of X.
P : probability mass function.
I : the information content of X.
E : the expected value operator.
The entropy can explicitly be written as:
ID3[2]
Use ID3 to build a decision tree:
Calculate the entropy of the samples under the current node.
Find a feature F that can maximize the information gain. The information gain is calculatd by:
where E is the entropy of
AutoTag
2019-03-12
Tech
364
AutoTag
AutoTag is a program that generate tags for documents automatically. The main process includes:
Participle (N-gram + lookup in dictionary)
Generate bag-of-words for each document.
Calculate term frequency and inverse document frequency.
Pick top x words with greater tf-idf values as tags.
N-gram
N-gram generate a sequence of n words in every position of a sentence.[1]
sentences = 'Lucy like to listen to music. Luna like music too.'
items = ngram(sentences, 2)
print(items)
# output:
[
'Lucy like',
'like to',
'to listen',
'listen to',
'to music',
'Luna like',
'like music',
'music too',
]
Bag of words
The bag-of-words model is a simplifying representation in NLP and IR.[1]
N-gram
Count the times that each word appears
w3m
2019-03-01
Tech
104
w3m: WWW wo Miru
(c) Copyright Akinori ITO
w3m is a pager with WWW capability. It IS a pager, but it can be used as a text-mode WWW browser.
Keyboard Shortcuts
Shortcut
Action
Level
H
Help
Brower
q
Quit w3m
Brower
C-h
History
Brower
T
New tab
Tabs
C-j
Open link on current tab
Tabs
C-t
Open link on new tab
Tabs
C-q
Close tab
Tabs
U
Go to url
Page
R
Reload
Page
B
Back
Page
Configuration
File
Location
Keymap file
~/.w3m/keymap
Remember Minori
2019-02-28
Articles
148
十年时间转瞬即逝,许多曾经深爱的事物都一一离去了,不免让人生出逝者如斯的感慨。
说来有些讽刺,作为一个计算机的学生,理应走在前沿,我却总怀念老式的交互和十年前的网络。
人生得意须尽欢,莫使金樽空对月。
如同那句台词所说,我们总要不断地告别。和老去的事物告别,和过去告别。至少在最后道一声感谢,笑着告别。
More Blogs
2019-02-12
Misc
64
Friends' Blogs
EquimA Kernel Implementation
長門有希An Outstanding ACMer
Bug ZhangA Great Web Master
Mr. ZhuA Dreamer
Favorite Blogs
CHARON(カロン)同人サークル「CHARON」の公式サイト
海底囚人海底囚人の趣味の個人サイトです
Mount NTFS in Linux and Compatible with Window
2019-02-04
Tech
682
Permission Control for NTFS
We often encounter the problem that to mount NTFS under Linux means no permission control. But that is not true.
According to JanC's Answer on AskUbuntu:
Contrary to what most people believe, NTFS is a POSIX-compatible filesystem, and it is possible to use permission on NTFS.
The First Trial
First let's just open /etc/fstab and see how partitions are mounted.
$ sudo nano /etc/fstab
In my situation, the NTFS partitions are mounted as followed:
/dev/sda1 /mnt/NTFS1 auto nosuid,nodev,nofail,x-gvfs-show 0 0
/dev/sda2 /mnt/NTFS2 auto nosuid,nodev,nofail,x-gvfs-show 0 0
The nosuid keyword prevents setting uid on filesystem. So the First step is to remove this keyword.
However, once this keyword is removed, an uid and a gid must be given to setup the permission control. By default, the current uid and the current gid will be used. We can also specifiy
iptables basic
2018-12-27
Tech
171
List
# list filter table
sudo iptables -L
# list nat table
sudo iptables -L -t nat
Redirect
# Redirect locally
sudo iptables -A OUTPUT -t nat -p tcp --src 127.0.0.1 --dport 80 -j REDIRECT --to-port 8080
# Redirect in LAN
sudo iptables -A PREROUTING -t nat -p tcp --src 10.42.0.0/24 --dst 10.42.0.1 --dport 80 -j REDIRECT --to-port 8080
Filter
# Reject with ICMP-port-unreachable.
sudo iptables -A OUTPUT --dst www.bing.com -j REJECT
# Drop and hang up the connection.
sudo iptables -A OUTPUT --dst www.bing.com -j DROP
Package flow paths
Package flow paths (from iptables - Wikipedia):
Piano Basic
2018-12-24
Stuff
203
Major and Minor
Step
The step between each note below is a half:
C C#(Db) D D#(Eb) E F F#(Gb) G G#(Ab) A A#(Bb) B C
Steps in Major and Minor
S: a half step
T: a whole step
E: one and a half step
Steps in Major: TTSTTTS
Steps in (Natural) Minor: TSTTSTT
Steps in Harmonic Minor: TSTTSES (introducing the tension resolution sound. Create a leading tone.)
Steps in Melodic Minor: TSTTTTS (eliminates the gap while keeping the leading tone sound.)
Majors
E Major: E F# G# A B C# D#
C Major: C D E F G A B
Piano Piece / Sheet
Notations:
The Treble Clef:
The Bass Clef:
C scale:
Circle of fifths: (source: C major - Wikipedia)
More at How To Read Sheet Music: Step-by-Step Instructions
More
Major scale - Wikipedia
Minor Scales - Natural, Harmonic
Bash Basic
2018-11-27
Tech
730
Substitution, dirname, basename and suffix
Substitution can be used to get path and short filename.
filename=a/b/c/name.file
echo ${filename#*/} # b/c/name.file
echo ${filename##*/} # name.file
echo ${filename%/*} # a/b/c
echo ${filename%%/*} # a
But in fact, there is better way:
filename=a/b/c/name.file
echo $(dirname $filename)
echo $(basename $filename)
Still, substitution is useful for getting filename without suffix or getting suffix.
filename=file.name.type
echo ${filename%.*} # file.name
echo ${filename##*.} # type
Parameter Expansion
String selection:
string=1234567890abcdefg
echo ${string: 10} # abcdefg
echo ${string: 10:5} # abcde
echo ${string: 7:-4} # 890abc
Array selection:
arr=(0 1 2)
arr=($
ffmpeg basic
2018-11-26
Tech
411
Input and Output
$ ffmpeg -i input.mp4 output.mp4
Cutting Video or Audio
Begin and length:
$ ffmpeg -i input.mp4 -ss 00:00:10 -t 01:00:10 output.mp4
Beginning time and ending time:
$ ffmpeg -i input.mp4 -ss 01:00 -to 10:00 -c copy output.mp4
"-c" is short for "-codec". Use "-codec:a" or "-codec:v" to specify audio or viedo respectively.
"-c copy" suggests that the data will be copy directly to output (rather than being converted in some cases).
Video Converting
$ ffmpeg -i input1.flv input2.flv -framerate 30 -f mp4 -vf "scale=1280x720" output.mp4
$ ffmpeg -i input1.flv input2.flv -framerate 30 -f mp4 -vf "crop=${w}:${h}:${x}:${y}" output.mp4
"-framerate 30" fixes the framerate to 30 fps.
"-f mp4" before
1Q84
2018-10-31
Excerpt
5803
“另外”司机对着后视镜说,“有一件事想請您记住: 事物往往和外表不一样。"
事物往往和外表不一样。青豆在脑中重复了一遍, 微颦眉尖。“什么意思?"
司机字斟句酌地说:“就是说,现在您要去做一件非同一般的事, 不是吗? 大白天从首都高速公路的避难阶梯爬下去, 这样的事普通人一般不会做。女人尤其不会。”
“大概吧。”青豆说。
“那么,一旦做了这样的事,往后的日常风景,该怎么说呢,看上去也许会和平常有点不一样,我也有过这样的经验。但是,不要被外表迷惑。现实永远只有一个。"
“你知道吗,才华和直觉最大的区别是什么?”
“不知道。”
“区别就在于,你再怎么才华横溢,也未必能填饱肚皮;但只要你拥有敏锐的直觉,就不必担心混不上饭吃。”
小松作为编辑,直觉中的确有种特别的东西。他永远不会迷
Setup Tensorflow GPU
2018-08-26
Tech
538
0. install GPU drivers
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-390
1. install cuda tookit and cudnn SDK (and CUPTI)
Following the instructions at Installing Tensorflow on Ubuntu.
# Adds NVIDIA package repository.
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
sudo dpkg -i nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
sudo apt-get update
# Includes optional NCCL 2.x.
sudo apt-get install cuda9.0 cuda-cublas-9-0 cuda-cufft-9-0 cuda-curand-9-0 \
cuda-cusolver-9-0 cuda-cusparse-9-0 libcudnn7=7.1.4.18-1
Convolutional Neural Network
2018-08-19
Machine-Learning
333
This article is my reflection on my previous work FaceLock, a project to recognize user's face and lock the computer if the user doesn't present in a certain time. CNN is used to recognize different faces. I watch the Coursera course Convolutional Neural Networks by Andrew Ng to understand more about CNN, so it's also a learning note about it.
One Layer of a Convolutional Network
In a non-convolutional network, we have the following formula:
Similarly, in the convolutional network, we can have:
@ is a convolution operation.
@ is the input matrix.
@ is the filter. Different filter can detect different feature, e.g. vertical edge, diagonal edge, etc.
@ is the bias.
@ is a activation function.
@ is the output matrix, and can be fed to the next layer.
Calculating the Number
The Number of the Parameters
Suppose we have 10 filters which are in one layer of a neural
Mathematical Basis - Squashing Function
2018-08-11
Machine-Learning
444
This article is about some squashing functions of deep learning, including Softmax Function, Sigmoid Function, and Hyperbolic Functions. All of these three functions are used to squash value to a certain range.
Softmax Function
Softmax Function: A generalization of the logistic function that "squashes" a K-dimensional vector z of arbitrary real values to a K-dimensional vector of real values, where each entry is in the range (0, 1], and all the entries add up to 1.
In probability theory, the output of the softmax function can be used to represent a categorical distribution - that is, a probability distribution over K different possible outcomes.
The softmax function is the gradient of the LogSumExp function.
LogSumExp Function
LogSumExp Function: The LogSumExp(LSE) function is a smooth approximation to the maximum function.
( stands for the natural logarithm function, i.e. the logarithm to the base e.)
When directly encountered, LSE can be well-approximated by :
Sigmoid
Recurrent Neural Network
2018-07-30
Machine-Learning
388
This article is my learning note of the coursera course Sequence Models by Andrew Yan-Tak Ng.
There are two typical RNN units of the hidden layers of the RNN according to Andrew Ng. One is GRN (Gated Recurrent Unit), the other is LSTM (Long Short-Term Memory).
Notice: Please refer to Mathematical Basis - Squashing Function for some basic math knowledge about the squashing functions.
GRN - Gated Recurrent Unit
The GRN is a gating mechanism in recurrent neural networks, introduced in 2014 by Kyunghyun Cho et al.
The fully gated version :
The formulas :
@ : The memory cell.
@ : The input sequence.
@ : The output sequence.
@ : Gate gamma r. It tells us how relevance is to computing the next candidate for .
@ : Gate gamma u. The update gate vector. Decide whether or not we actually update , the memory cell.
@ : The candidate value for the memory
3D - Transformation and Lighting
2018-05-06
Tech
2332
Transformation
Defines a struct ObjectProperty as follow:
property
type
comment
rotation
vec3
The object's rotation in its object center.
scale
vec4 / vec3
The object's scale level in its object center.
translation
vec3
The object's relative position to the origin (absolute position).
face
vec3
Where the object look at.
The code:
struct ObjectProperty
{
vec3 rotation;
vec4 scale;
vec3 translation;
vec3 face;
};
Object Transformation
Object Transformation Order:
-> Rotate -> Scale -> Translate
Rotate and scale around the center of the object, so that the order of these transformations will not influence the result.
The transformation matrix is:
struct ObjectProperty objProp;
mat4 m4ModelMatrix = translate(objProp.translation) * scale(objProp.scale) * rotate(objProp.rotation);
The transformation operations are:
vec4 v4PosInModel = m4ModelMatrix * vertex.position;
Where vertex.position is a vec4 object containing the position of the
Quicksort
2018-04-21
Algorithm
2023
Time Complexity
With partition, the elements compared each time are as follow:
n, n/2, n/2, n/4, n/4, n/4, n/4, ..., 1, 1, ..., 1 (n ones)
And their sum is:
n, n, n, ..., n
But how much n?
Since n is divided to 1 by 2, there sure is log2(n) number of n.
So the time complexity is: n*log2(n)
Unique-elements sorting implement
From Quicksort - wikipedia:
algorithm quicksort(A, lo, hi) is
if lo < hi then
p := partition(A, lo, hi)
quicksort(A, lo, p)
quicksort(A, p + 1, hi)
algorithm partition(A, lo, hi) is
pivot := A[lo]
i := lo - 1
j := hi + 1
loop
Cpp Master
2018-04-03
Tech
1260
以下皆为在开发我的 C++ 库 donnylib 的时候遇到的问题。 (一天遇到这么多 weird problems 也是很走运了)
donnylib : https://github.com/Donny-Hikari/donnylib
Weird Template Subclass
如下的程序会导致无法自动推断模板类型。(编译环境: g++ 5.4.0, -std=c++17)
template<class T>
struct FOO
{
struct NOT
{
NOT() { }
};
FOO() { sizeof(T); }
};
template<class T>
typename FOO<T>::NOT foo(typename FOO<T>::NOT a)
{
return a;
}
void test1()
{
FOO<int>::NOT f;
foo(f); // template argument deduction/substitution failed.
}
加上下面的代码也是不行的。
template<class T>
using NOT = typename FOO<T>::NOT;
template<class T>
NOT<T> foo(NOT<T
AdaBoost
2018-03-30
Machine-Learning
1004
Python 实现: AdaBoost - Donny-Hikari - Github
Introduction
AdaBoost 是 Adaptive Boosting 的简称。 Boosting 是一种 Ensemble Learning 方法。 其他的 Ensemble Learning 方法还有 Bagging, Stacking 等。 Bagging, Boosting, Stacking 的区别如下:
Bagging:
Equal weight voting. Trains each model with a random drawn subset of training set.
Boosting:
Trains each new model instance to emphasize the training instances that previous models mis-classified. Has better accuracy comparing to bagging, but also tends to overfit.
Stacking:
Trains a learning algorithm to combine the predictions of several other learning algorithms.
The Formulas
Given a N*M matrix X, and a N vector y, where N is the count of samples, and M is the features of samples. AdaBoost trains T weak classifiers with the following steps:
给定一个N*M的矩阵X(特征),和一个N维向量y(标签),N为样本数,M为特征维度。AdaBoost以一下步骤训练T个弱分类器:
Android Library and Proguard
2018-01-11
Tech
367
Android Library
To build a android library project, simply change the following line in build.gradle (Module):
apply plugin: 'com.android.application'
to:
apply plugin: 'com.android.library'
To pack the library into a jar file, add these lines to build.gradle (Module):
def JarLibName="YourLibName"
task clearJar(type: Delete) {
delete 'build/generated/' + JarLibName + '.jar'
}
task makeJar(type: Jar) {
from zipTree(file('build/intermediates/bundles/release/classes.jar'))
baseName = JarLibName
destinationDir = file("build/generated")
}
makeJar.dependsOn(clearJar, build)
And run the following command in the Terminal:
./gradlew makeJar
If failed in lint process, add these to build.gradle (Module):
android {
// ...
lintOptions {
abortOnError false
}
// ...
}
Remember to copy the jar library file
Classification And Overfitting
2017-11-20
Machine-Learning
658
This is a learning note of Logistic Regression of Machine Learning by Andrew Ng on Coursera.
Hypothesis Representation
Uses the "Sigmoid Function," also called the "Logistic Function":
Which turn linear regression into classification.
Sigmoid function looks like this:
give us the probability that the output is 1.
In fact, is simplified as
for logistic regression, and is
for linear regression. In some complicated case, z might be something like:
Decision Boundary
Decision boundary is the line (or hyperplane) that separates the area where y = 0 and where y = 1 (or separates different classes). It's created by our hypothesis function.
The input to the sigmoid function is not necessary to be linear, and could be a function that describes a circle (e.g. ) or any shape to fit the data.
Cost Function
Using the cost function for linear regression in classification will cause the output to be wavy, resulting in many local optima.
VJ193259 C Asa's Chess Problem
2017-10-29
ACM
2192
Concept
上下界网络流
顾名思义,比普通的网络流多出了下界。
考虑将下界的流量限制进行转化。将原来的边(e, v1->v2)拆边,拆出一条上界为 max-min 的边(e1),和一条上界为 min 的边(e2, 虚拟存在)。 e2 需要满足满流方才有解,但满流的同时还需要保证流量守恒(入流==出流)。故而构造虚拟源点 SS 和虚拟汇点 TT .
对于点v, 若:
下限入流等于下限出流,则两者平衡,可以认为这两个流出流入的流量都不存在。
下限入流(flowin)大于下限出流(flowout),则 SS 向 v 连一条容量为 flowin-flowout 的边,弥补不足流量。
下限入流(flowin)小于下限出流(flowout),则 v 向 TT 连一条容量为 flowout-flowin 的边,弥补过剩流量。
当且仅当 SS 的所有边都满流时有解。跑一遍 SS 到 TT 的最大流,然后验证即可。
有源汇上下界可行
HDU-5383 - Yu-Gi-Oh! - 最小费用网络流
2017-10-27
ACM
2419
Problem
合怪升级,每只怪有他的等级和攻击力,以及是否为 tuner monster. 合怪必须一只 tuner monster 与一只 non-tuner monster 合,并且他们的等级之和必须等于合成的怪。只有特定等级和攻击力的怪可以被合成。某些合成怪必须用特定的某一只或某两只怪才能合成。
Solution
看懂题目后建图跑网络流。主要难点在建图。
将 tuner monster(t1) 连到源点,将 non-tuner monster(t2) 连到汇点, capacity 均为 1, cost 均为0。并且列举将该怪与所有另一种怪的组合,以合成等级为索引保存。注意最大等级上限为12,超过12的组合直接 pass 掉。
然后处理合成怪。在合成怪对应等级中搜索对应满足条件的组合,加入到 t1, t2 矩阵中。这里有一个必须搞的优化,就是只选攻击力最大且大于两只被合成怪的合成怪,对满足的 t1,t2 连 capacity 为 1, cost 为 t1 攻击力 加 t2 攻
莫比乌斯反演
2017-10-25
Algorithm
336
Introduction
最精彩的证明是自动证明。今天在知乎上看到的一位大佬对莫比乌斯反演的证明让我体会到了数论的神奇。坚定了我投靠数学的决心。
Mobius Inversion Formula
若存在
则有
反之亦然。
Basic Concept
1. Convolution / 卷积
对两个数论函数, , 卷积运算定义为
卷积运算满足:
交换律:
结合律:
存在单位元. . 不难知道定义应为:
2. 乘法单位元
普通乘法意义上的单位元,将所有自然数映射到1的函数,记为。
3. 莫比乌斯函数
在卷积意义下的逆元,称为莫比乌斯函数。即
展开写作:
定义为:
在知乎上大神引用的定义是:
这两种定义是等价的。前者是英文 Wikipedia 上给出的定义。后者可能比较好理解一些。
Prove
莫比乌斯反演写作卷积形式:
证明:
反之
Reference
参考资料:
[1] 知乎 - 如何证明莫
VJ193259 I - A Boring Problem
2017-10-23
ACM
875
Problem
给定数字串 S,求如下表达式的值:
Solution
用前缀和转化该式。令 ,
则 .
原式转化为:
二项展开,得:
将内外求和对调,使得可以预处理前缀和的和的p次方。式子化为:
预处理 , 复杂度 O(N*K).
对 , 处理每个答案,复杂度也是 O(N*K).
坑点:
每个case末尾输出空格PE. 最后不输出换行会WA... 于是我从PE改到了WA...
Code
#include <cstdio>
#include <cstring>
typedef long long LL;
const LL MOD=1e9+7;
const int MAXN=5e4+10;
char S[MAXN];
LL pre[MAXN];
LL spre[MAXN][110];
LL ppre[MAXN][110];
LL C[110][110];
void init()
{
C[0][0]=1;
for (int i=1;i<=100;++i)
{
C[0
FaceLock - 脸部识别锁屏软件
2017-10-20
Tech
1916
最近(上个月)想研究一下人脸识别。人脸检测。找了下网上的资料,有不少都是介绍怎么用现成的模块识别的。但是我想了解的是用神经网络进行人脸识别,并且希望能够更多地接触神经网络。于是往基于TensorFlow框架的人脸识别方面查找了下,幸运地找到了 Hironsan 的 BossSensor 项目。学习他的代码,我了解到了 Keras 框架搭建神经网络的步骤。在此基础上进行了一系列改进,获得了更高的精度。
Main Produce
预处理 -> 数据集划分 -> 人脸图像输入 -> 卷积神经网络 -> 分类输出 -> 决策
Preprocess
首先,获取人脸图像数据。我是通过 OpenCV 采集照片中的人脸。照片来源:现场取材,身边人下手,手机照片,网络收集。(直到我做完才想起来有个 ImageNet 这种方便的东西。)
然后是读取图像数据,将人脸图像预处理,将图片缩放和增加 Padding, 使得图片的长宽像素一致
SCU-4444 - Travel - 补图最短路
2017-08-27
ACM
-
HDU-5383 - Yu-Gi-Oh! - 最小费用网络流
2017-08-26
ACM
-
VJ-180367-B 神奇的%系列二 - 莫队算法
2017-08-24
ACM
-
POJ-1459 - Power Network - 网络流
2017-08-23
ACM
-
VJudge 173751 B - R2D2 and Droid Army - RMQ & Sparse Table
2017-07-30
ACM
-
VJ 172735 F - Drying; 从TLE到刚好AC再到妥妥AC; 以及超级巨坑iostream
2017-07-25
ACM
-
VJ 172735 A - Can you find it? - 二分查找
2017-07-25
ACM
-