
Becoming an Artist with Deep Learning: Imitating PRISMA

Updated: 2017-09-27 09:43:26 | Views: 2194

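Painting with deep learning began with three German researchers who wanted to train a computer to paint like Van Gogh. They developed an algorithm that mimics how human vision processes images. Specifically, they trained a multi-layer convolutional neural network so that the computer could recognize and learn Van Gogh's "style", and then turn any ordinary photo into something resembling Van Gogh's The Starry Night.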

Figure 9-1 Deep Art

They later founded the company Deep Art, where the "programmer" responsible for painting is a convolutional neural network (CNN). Given a work of art, such as Van Gogh's The Starry Night, the CNN automatically extracts the painting's "style features" and saves them as a style template. In other words, the CNN can be regarded as a machine artist.

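Prisma was the first to commercialize this painting technique successfully. Born in Russia, Prisma is an image-processing app built by just four young people in a month and a half. They took full account of the rapid growth in smartphone penetration and studied user behavior carefully. Prisma tapped into a market measured in hundreds of millions of users. Russian Prime Minister Dmitry Medvedev also became a Prisma user: he posted a Prisma picture on Instagram that quickly collected 87,000 likes.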

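Google's Deep Dream is another computer that can paint. It automatically recognizes an image, selects parts of it and exaggerates them to create a psychedelic effect. Deep Dream is fully open source, and the official examples of several mainstream deep-learning libraries, such as Keras and Caffe, include Deep Dream demos.
In this chapter we explore how to achieve an effect similar to Prisma's.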

Notes:
Full project for this chapter: https://github.com/bbfamily/prisma_abu
Demo video for this project: m.v.qq.com/play.html?&vid=v0397sv1fab (the video can also be watched directly in the WeChat official account abu_quant).

A First Look at Machine-Learning Painting

Good artists copy the surface; great artists steal the soul. (Picasso)

This section introduces the basic principle behind machine-learning painting and shows some of its output.

Conceptual Basis of Artistic Painting

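Chapter 6 described how a CNN extracts graphical features from an image and uses them to recognize the objects in it. Now suppose we have already trained a deep CNN model that "recognizes cats painted by Picasso". What happens if we feed it a completely different picture, say a photo of a dog?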

Figure 9-2 CNN feedback modifying the input image

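The model returns a probability score that expresses how strongly it believes the photo is a "Picasso cat". The input passes through many CNN layers, and each layer searches the dog photo for graphical evidence that the sample is a Picasso cat. The lower layers analyze more concrete features and the higher layers more abstract ones. In the end, of course, the model gives a very low score.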

What if, while trying to recognize a Picasso cat in the dog photo, the model were allowed to modify its input sample?

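Add a feedback loop to the network so that each layer can modify the dog photo in the direction that increases the final score. In every iteration each layer adds some traces of Picasso-cat features to the dog photo. After many iterations the dog photo carries more and more of the Picasso cat's features.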

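This is the conceptual basis of painting with a convolutional neural network: let the CNN of an artistic-style model modify the input image according to graphical features, superimposing an artistic effect. The rough approach is as follows: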

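Feed in a feature image and train a style model, so that the computer learns the artistic style.
Feed in the image to be processed; the style model guides the modification of the input and generates a new image, the "artistic painting".
Next we will imitate Prisma's effect and implement artistic painting.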
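The feedback loop described above can be illustrated with a toy sketch. This is not the CNN used in this chapter: the "model" is a hypothetical one-layer logistic scorer with fixed random weights, and the names score and ascend_input are invented for illustration. It only shows the core idea that the input, rather than the weights, is nudged in the direction that raises the score.

```python
import numpy as np


def score(x, w, b):
    """Toy 'model': probability that input x matches the learned style."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))


def ascend_input(x, w, b, step=0.1, iter_n=50):
    """Gradient ascent on the input: the weights w, b stay fixed."""
    x = x.copy()
    for _ in range(iter_n):
        p = score(x, w, b)
        grad = p * (1.0 - p) * w  # d(score)/dx for a sigmoid of a linear map
        x += step * grad          # move the input towards a higher score
    return x


rng = np.random.RandomState(0)
w = rng.randn(16)             # stands in for a trained style model
x0 = 0.1 * rng.randn(16)      # stands in for the dog photo
x1 = ascend_input(x0, w, 0.0)
before, after = score(x0, w, 0.0), score(x1, w, 0.0)
```

In a real network the gradient comes from backpropagation through all layers, but the update has the same shape: the image is changed, the model is not.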

Getting a Feel for the Machine Artist

Here we show what machine painting looks like. The original image is shown in Figure 9-3.

Figure 9-3 Original image

Figure 9-4 shows the original image stylized by several shallow-layer neurons (note: the code is in the book's Git repository).

Figure 9-4 Machine-style painting results

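Notice how the recognized features strengthen step by step from shallow to deep, progressing from edges, to shapes, to more complex shapes. This is mainly the convolutional layers amplifying and exaggerating the characteristics of each layer.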

Figure 9-5 shows paintings in more styles.

Figure 9-5 Machine painting results

An Interesting Experiment

What happens if we produce an image with Prisma and then use it as the feature image to guide the generation of a new image?

guide = np.float32(
    tp.resize_img(PIL.Image.open('../prisma_gd/106480401.jpg')))
PrismaHelper.show_array_ipython(guide)

Output:

Figure 9-6 Guide image

As Figure 9-7 shows, some of the features were indeed picked up.

PrismaHelper.show_array_ipython(tp.fit_guide_img(s_file, gd_path, resize=True, size=640, iter_n=1500))

Output:

Figure 9-7 Result

The Vision for Machine Painting

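Once a machine can paint from the planar features of an image, many ideas follow. For example, we could discover image features in the guide image, identify the object present among those features, and fuse it into another image. If feature recognition and fusion could be pushed to the limit, the following imagined scenario would become possible.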

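You look up and see a cloud that looks a lot like your Labrador. Could the two be swapped?
You photograph the cloud and send the dog photo and the cloud photo to a server for feature recognition and fusion.
The server sends the fused image back to the user.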

Figure 9-8 The vision of feature fusion

Review

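This section introduced the principle of deep-learning painting and showed what it looks like. Producing these effects with open-source projects such as Deep Dream is very slow, so starting in Section 9.2 we implement a fast Prisma in our own way and paint within seconds.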

Painting in Seconds

Of all the martial arts under heaven, only speed is unbeatable. Speed is the sharpest weapon in Internet competition. (Lei Jun)

Compared with Deep Art, Prisma's advantage is a much shorter processing time: each photo is processed within seconds. Deep Art is more like a meticulous craftsman: its algorithm runs more slowly but renders detail better.

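Before writing this book, the author looked at several open-source painting projects. None of them met the speed of the real Prisma, so the methods used in this section are the author's own. The author does not know how Prisma achieves both good results and high speed, but the likely directions include the following: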

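(1) Large numbers of multi-CPU and GPU machines (quite unrealistic, because the cost could not be controlled).
(2) Unknown algorithm or network-framework optimizations (even if so, this is a black box we are not able to break into).
(3) A very large image database that quickly retrieves images highly similar to the input, followed by similar-feature extraction and weighted rendering.
(4) Applying machine-learning algorithms to only parts of the image to amplify feature layers, combined with image-processing techniques to speed up rendering.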

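The third approach builds a classification model on top of a massive image collection. For an input image, the model first classifies its type and then picks a fixed feature-extraction method for rendering according to that type. The fourth approach uses image preprocessing to reduce the workload of the machine-learning algorithm.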

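Since we do not have enough image resources, and the speed bottleneck of the third approach would lie in retrieval and similarity computation, what follows experiments with the fourth approach, which gives fairly satisfying results in both speed and rendering quality. Its only drawback is generality: the parameters need some adjustment in practice. In real use it can therefore be combined with the third approach: take a number of samples as the training set x, with the effect parameters as the corresponding y, classify the input, and use similarity measures to improve automatic adaptation.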

Main Implementation Ideas, Step by Step

We again use the image abu1 as input:

from PrismaCaffe import CaffePrismaClass
import PrismaHelper
import numpy as np
import PIL.Image

abu1_file = '../sample/abu1.jpg'
cp = CaffePrismaClass(dog_mode=False)
PrismaHelper.show_array_ipython(
    np.float32(cp.resize_img(PIL.Image.open(abu1_file))))

The output is shown in Figure 9-9.

Figure 9-9 Original image

Next, pick a guide feature image:

gd_path = '../prisma_gd/tooopen_sy_127260228921.jpg'
guide_img = np.float32(
    cp.resize_img(PIL.Image.open(gd_path), base_width=480,
                  keep_size=False))
PrismaHelper.show_array_ipython(guide_img)

Output:

Figure 9-10 Guide image

As shown below, we first convert the image to a single channel, find the mask with Otsu, and then determine the border and edges from the mask.

Note: Otsu (Otsu's method, an adaptive threshold)

For skimage usage such as otsu, see: http://scikit-image.org/
For scipy usage such as ndimage, see: https://docs.scipy.org
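To make the note on Otsu's method concrete, here is a minimal NumPy sketch of the idea: pick the threshold that maximizes the between-class variance of the gray-level histogram. The function name otsu_threshold and the toy image are assumptions made for illustration; the chapter itself calls filters.threshold_otsu from skimage, and this sketch is not claimed to match that implementation in every detail.

```python
import numpy as np


def otsu_threshold(gray, nbins=256):
    """Return the threshold that maximizes between-class variance."""
    hist, edges = np.histogram(gray.ravel(), bins=nbins)
    hist = hist.astype(np.float64)
    centers = (edges[:-1] + edges[1:]) / 2.0

    # Pixel counts of the "low" class (bins 0..i) and "high" class (bins i..end)
    w_low = np.cumsum(hist)
    w_high_rev = np.cumsum(hist[::-1])
    w_high = w_high_rev[::-1]

    # Mean gray level of each class for every candidate split
    mean_low = np.cumsum(hist * centers) / np.maximum(w_low, 1e-12)
    mean_high = (np.cumsum((hist * centers)[::-1]) /
                 np.maximum(w_high_rev, 1e-12))[::-1]

    # Between-class variance when splitting after bin i
    between = w_low[:-1] * w_high[1:] * (mean_low[:-1] - mean_high[1:]) ** 2
    return centers[np.argmax(between)]


# Toy image: dark background (0.2) with a bright 20 x 20 square (0.8)
img = np.full((40, 40), 0.2)
img[10:30, 10:30] = 0.8
t = otsu_threshold(img)
mask = img > t  # True exactly on the bright square
```

Because the threshold is derived from the image's own histogram, no hand-tuned cut-off is needed, which is why the method is described as adaptive.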

Input:

from skimage import filters, segmentation

r_img = cp.resize_img(PIL.Image.open(abu1_file), base_width=480,
                      keep_size=False)
# convert RGB to a single-channel grayscale image
l_img = np.float32(r_img.convert('L'))
# scale to [0, 1]; skimage expects float images within (-1, 1)
l_img = np.float32(l_img / 255)
r_img = np.float32(r_img)
# pixels above the Otsu threshold form the mask used to find the border
mask = l_img > filters.threshold_otsu(l_img)
# clear_border is not needed for every image, e.g. when most of the
# subject should be kept
clean_border = segmentation.clear_border(mask).astype(int)
coins_edges = segmentation.mark_boundaries(l_img, clean_border)
# convert the values back to 0-255
clean_border_img = np.float32(clean_border * 255)
clean_border_img = np.uint8(np.clip(clean_border_img, 0, 255))

PrismaHelper.show_array_ipython(clean_border_img)

Output:

Figure 9-11 Single-channel image

The goal is to extract only the dog and treat everything else as noise. We can try ndimage.binary_opening. Does it achieve this? Let's see:

from scipy import ndimage

clean_border_img = ndimage.binary_opening(
    np.float32(clean_border_img / 255),
    structure=np.ones((5, 5))).astype(int)
clean_border_img = ndimage.binary_opening(clean_border_img).astype(int)
PrismaHelper.show_array_ipython(clean_border_img * 255)

Output:

[Image]

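The result is actually not very good. ndimage.binary_opening behaves somewhat like a min-pooling layer in a CNN: its purpose is to remove small objects from the image. Below we filter with simple custom convolution kernels instead to get what we need.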
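What binary opening does can be seen on a toy mask. The sketch below uses scipy.ndimage.binary_opening with a 3 x 3 structuring element; the mask layout is invented for illustration. Opening is an erosion followed by a dilation, so anything smaller than the structuring element disappears while larger blocks survive.

```python
import numpy as np
from scipy import ndimage

mask = np.zeros((20, 20), dtype=bool)
mask[5:13, 5:13] = True   # a large 8 x 8 object we want to keep
mask[2, 16] = True        # an isolated one-pixel speck (noise)

# Erosion removes the speck; the following dilation restores the square
opened = ndimage.binary_opening(mask, structure=np.ones((3, 3)))
```

On a real photo the "small objects" are rarely this clean, which is one reason the result on the dog image is unsatisfying.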

The kernel-based filtering code is as follows:

from scipy.signal import convolve2d

# Small kernel: keeps the overall structure of the image
n = 5
small_window = np.ones((n, n))
small_window /= np.sum(small_window)
clean_border_small = convolve2d(clean_border_img, small_window,
                                mode="same", boundary="fill")

# Medium kernel: keeps the embedded parts of the image, here the dog's
# black nose and mouth
n = 25
median_window = np.ones((n, n))
median_window /= np.sum(median_window)
clean_border_convd_median = \
    convolve2d(clean_border_img, median_window, mode="same",
               boundary="fill")

# Large kernel: only removes scattered edges; often unnecessary, and it
# costs both speed and quality
n = 180
big_window = np.ones((n, n))
big_window /= np.sum(big_window)
clean_border_convd_big = convolve2d(clean_border_img, big_window,
                                    mode="same", boundary="fill")

l_imgs = []
for d in range(3):
    # filter each of the three channels separately
    rd_img = r_img[:, :, d]
    gd_img = guide_img[:, :, d]
    # keep the original image where the condition holds, otherwise use
    # the guide feature image
    d_img = np.where(np.logical_or(
        clean_border_convd_median > 5 * clean_border_convd_big.mean(),
        np.logical_and(clean_border_small > 0, clean_border_convd_big \
                       > 2 * clean_border_convd_big.mean())),
        rd_img, gd_img)
    l_imgs.append(d_img)
img_cvt = np.stack(l_imgs, axis=2).astype("uint8")
# one light pass of shallow-feature amplification on the converted image
d_img = cp.fit_img(nbk='conv2/3x3_reduce', iter_n=10, img_np=img_cvt)
PrismaHelper.show_array_ipython(np.float32(d_img))

Output:

Figure 9-13 Final result

There is not much code. The main steps are:

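Find the image mask with filters.threshold_otsu.
Extract the image border and edges with segmentation.clear_border(mask).
Filter the image with three convolution kernels; see the code comments above for the role of each kernel. The filtering here is what distributes weight between the guide features and the original image.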

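Convolution can be understood simply as a weighted superposition: the output is built from the unit response to each input. Why use convolution? From an engineering point of view, the answer is efficiency. Starting from the goal, we know which pixels to filter out and which to keep, and could code that as a computation; then loop over every pixel with a for loop, run the computation within a certain range (the kernel size), and advance step by step. That would also produce the result, but the running time would be several orders of magnitude larger.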
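The equivalence described above can be checked directly. The sketch below compares scipy.signal.convolve2d with an explicit per-pixel loop for a normalized box kernel, the same kind of kernel as small_window in the code earlier. The helper box_filter_loops and the random test image are assumptions made for illustration.

```python
import numpy as np
from scipy.signal import convolve2d


def box_filter_loops(img, n):
    """Per-pixel loops: mean over an n x n window with zero padding."""
    pad = n // 2
    padded = np.pad(img, pad, mode="constant")
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + n, j:j + n].sum() / float(n * n)
    return out


rng = np.random.RandomState(0)
img = rng.rand(32, 32)
n = 5
window = np.ones((n, n)) / (n * n)

fast = convolve2d(img, window, mode="same", boundary="fill")
slow = box_filter_loops(img, n)
same_result = np.allclose(fast, slow)
```

Both paths compute the same numbers; the library call simply avoids the Python-level loops, and the gap widens quickly as the image and the kernel grow.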

1. Using image features as the mask

The method above uses Otsu to find image edges as the basis for the mask. Below we use corner_peaks from skimage to extract image features as the mask.

import matplotlib.pyplot as plt
from skimage.feature import corner_harris, corner_peaks, corner_subpix


def show_features(gd_file):
    r_img = cp.resize_img(PIL.Image.open(gd_file), base_width=480,
                          keep_size=False)
    l_img = np.float32(r_img.convert('L'))
    ll_img = np.float32(l_img / 255)

    coords = corner_peaks(corner_harris(ll_img), min_distance=5)
    coords_subpix = corner_subpix(ll_img, coords, window_size=25)

    plt.figure(figsize=(8, 8))
    plt.imshow(r_img, interpolation='nearest')
    plt.plot(coords_subpix[:, 1], coords_subpix[:, 0], '+r',
             markersize=15, mew=5)
    plt.plot(coords[:, 1], coords[:, 0], '.b', markersize=7)
    plt.axis('off')
    plt.show()


def find_features(gd_file=None, r_img=None, l_img=None, loop_factor=1,
                  show=False):
    if gd_file is not None:
        r_img = cp.resize_img(PIL.Image.open(gd_file), base_width=480,
                              keep_size=False)
        l_img = np.float32(r_img.convert('L'))
        l_img = np.float32(l_img / 255)

    coords = corner_peaks(corner_harris(l_img), min_distance=5)
    coords_subpix = corner_subpix(l_img, coords, window_size=25)

    r_img_copy = np.zeros_like(l_img)
    rd_img = np.float32(r_img)

    r_img_copy[coords[:, 0], coords[:, 1]] = 1
    f_loop = int(rd_img.shape[1] / 10 * loop_factor)
    for _ in np.arange(0, f_loop):
        # grow the feature points; loop_factor controls how far they grow
        r_img_copy = ndimage.binary_dilation(r_img_copy).astype(
            r_img_copy.dtype)
    r_img_copy_ret = r_img_copy * 255
    if show:
        r_img_copy_d = [rd_img[:, :, d] * r_img_copy for d in range(3)]
        r_img_copy = np.stack(r_img_copy_d, axis=2)
        PrismaHelper.show_array_ipython(r_img_copy)
    return r_img_copy_ret

Display the extracted feature points:

show_features('../prisma_gd/71758PICxSa_1024.jpg')

The output is shown in Figure 9-14.

Figure 9-14 Original image

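find_features uses ndimage.binary_dilation to grow the feature points, with loop_factor controlling how far they grow. The purpose is to raise the feature weight of the original image when rendering with the guide features. The result of find_features is shown below.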

_ = find_features('../prisma_gd/71758PICxSa_1024.jpg', loop_factor=1,
                  show=True)

The output is shown in Figure 9-15.

Figure 9-15 Extraction result
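The dilation step inside find_features can be looked at in isolation. The sketch below grows a single seed pixel with repeated calls to scipy.ndimage.binary_dilation, mirroring the loop in find_features; the helper name grow_points and the 21 x 21 toy grid are assumptions made for illustration. With the default cross-shaped structuring element, k passes turn one pixel into a diamond of 2k(k+1)+1 pixels.

```python
import numpy as np
from scipy import ndimage


def grow_points(points_mask, n_iter):
    """Dilate a 0/1 mask n_iter times, keeping the input dtype."""
    grown = points_mask.copy()
    for _ in range(n_iter):
        grown = ndimage.binary_dilation(grown).astype(points_mask.dtype)
    return grown


seed = np.zeros((21, 21), dtype=np.float32)
seed[10, 10] = 1  # one detected "corner"

# Area covered after 0, 1, 2 and 3 dilation passes
areas = [int(grow_points(seed, k).sum()) for k in range(4)]
# areas == [1, 5, 13, 25]
```

In find_features the number of passes is int(width / 10 * loop_factor), so for a 480-pixel-wide image and loop_factor=1 each corner is dilated 48 times, which is why neighboring feature points merge into solid regions.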

Below, the interactive mode of IPython Notebook gives a more direct view of feature extraction and amplification.

from ipywidgets import interact


def find_features_interact(gd_file, loop_factor):
    r_img = cp.resize_img(PIL.Image.open(gd_file), base_width=480,
                          keep_size=False)
    l_img = np.float32(r_img.convert('L'))
    l_img = np.float32(l_img / 255)

    coords = corner_peaks(corner_harris(l_img), min_distance=5)
    coords_subpix = corner_subpix(l_img, coords, window_size=25)

    r_img_copy = np.zeros_like(l_img)
    rd_img = np.float32(r_img)

    r_img_copy[coords[:, 0], coords[:, 1]] = 1
    f_loop = int(rd_img.shape[1] / 10 * loop_factor)
    for _ in np.arange(0, f_loop):
        # grow the feature points; loop_factor controls how far they grow
        r_img_copy = ndimage.binary_dilation(r_img_copy).astype(
            r_img_copy.dtype)
    r_img_copy_ret = r_img_copy * 255
    r_img_copy_d = [rd_img[:, :, d] * r_img_copy for d in range(3)]
    r_img_copy = np.stack(r_img_copy_d, axis=2)
    PrismaHelper.show_array_ipython(r_img_copy)


gd_file = ('../prisma_gd/71758PICxSa_1024.jpg', '../prisma_gd/st.jpg',
           '../prisma_gd/g1.jpg', '../prisma_gd/31K58PICSuH.jpg')
loop_factor = (0, 2, 0.1)
interact(find_features_interact, gd_file=gd_file,
         loop_factor=loop_factor)

The output is shown in Figure 9-16.

Figure 9-16 Extraction result

Finding the Mask with the Statistical Mean and Standard Deviation

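There are many ways to extract a mask. One example is the mean-reversion analysis often used in quantitative stock statistics: the mean and standard deviation of the image serve as the filter that extracts the mask.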

Next we refactor the code and generate the mask with each of three image-filtering methods to compare the results.

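(1) def do_otsu(r_img, l_img, cb): the Otsu code above, wrapped in a function.
(2) def do_features(r_img, l_img, cb, loop_factor=1.0): the approach introduced earlier that extracts feature points and grows them.
(3) def do_stdmean(r_img, l_img, cb, std_factor=1.0): extracts the mask statistically from the mean and standard deviation.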
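The mean and standard deviation idea in item (3) can be tried on synthetic data before reading the full implementation. The sketch below keeps pixels that lie more than std_factor standard deviations away from the mean, on either side. The helper stdmean_mask and the normally distributed toy image are assumptions made for illustration; only the thresholding rule is taken from the chapter.

```python
import numpy as np


def stdmean_mask(gray, std_factor=1.0):
    """Select pixels farther than std_factor * std from the mean."""
    mean, std = gray.mean(), gray.std()
    high = gray > mean + std * std_factor
    low = gray < mean - std * std_factor
    return np.logical_or(high, low)


rng = np.random.RandomState(0)
gray = rng.normal(0.5, 0.1, size=(100, 100))

# Fraction of pixels selected for several values of std_factor
coverage = {f: stdmean_mask(gray, f).mean() for f in (0.5, 1.0, 2.0)}
```

A larger std_factor selects fewer, more extreme pixels, so the parameter acts as a single knob for how much of the original image the mask protects.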

1. Producing Prisma-style images in several ways

Input:

from functools import partial


def do_otsu(r_img, l_img, cb, dd=True):
    # dd=True keeps pixels above the Otsu threshold, dd=False those below
    mask = l_img > filters.threshold_otsu(
        l_img) if dd else l_img < filters.threshold_otsu(l_img)
    clean_border = mask
    if cb:
        clean_border = segmentation.clear_border(mask).astype(int)
    clean_border_img = np.float32(clean_border * 255)
    clean_border_img = np.uint8(np.clip(clean_border_img, 0, 255))
    return clean_border_img


def do_features(r_img, l_img, cb, loop_factor=1.0):
    mask = find_features(r_img=r_img, l_img=l_img,
                         loop_factor=loop_factor)
    clean_border = mask
    if cb:
        clean_border = segmentation.clear_border(mask).astype(int)
    clean_border_img = np.float32(clean_border * 255)
    clean_border_img = np.uint8(np.clip(clean_border_img, 0, 255))
    return clean_border_img


def do_stdmean(r_img, l_img, cb, std_factor=1.0):
    """
        Get a feel for the image-filtering part
    """
    mean_img = l_img.mean()
    std_img = l_img.std()

    mask1 = l_img > mean_img + (std_img * std_factor)
    mask2 = l_img < mean_img - (std_img * std_factor)

    clean_border = mask1
    if cb:
        clean_border = segmentation.clear_border(mask1).astype(int)
    clean_border_img1 = np.float32(clean_border * 255)
    clean_border_img1 = np.uint8(np.clip(clean_border_img1, 0, 255))

    clean_border = mask2
    if cb:
        clean_border = segmentation.clear_border(mask2).astype(int)
    clean_border_img2 = np.float32(clean_border * 255)
    clean_border_img2 = np.uint8(np.clip(clean_border_img2, 0, 255))

    # combine the upper and lower parts
    clean_border_img = clean_border_img1 + clean_border_img2
    clean_border_img = np.uint8(np.clip(clean_border_img, 0, 255))
    return clean_border_img


def together_mask_func(r_img, l_img, cb, func_list):
    """
        Combine several mask functions into one mask filter by adding
        their masks together
        e.g. tgt_mask_func = partial(together_mask_func,
            func_list=[do_otsu, mask_stdmean_func, mask_features_func])
    """
    clean_border_img = None
    for func in func_list:
        if not callable(func):
            raise TypeError(
                "every item in func_list must be a function")
        border_img = func(r_img, l_img, cb)
        if clean_border_img is None:
            # accumulate in a wider dtype so 255 + 255 cannot wrap in uint8
            clean_border_img = border_img.astype(np.int32)
        else:
            clean_border_img = clean_border_img + border_img
    clean_border_img = np.uint8(np.clip(clean_border_img, 0, 255))
    return clean_border_img


# use partial to give all mask functions the same interface
mask_stdmean_func = partial(do_stdmean, std_factor=1.0)
mask_features_func = partial(do_features, loop_factor=0.88)


def do_convd_filter(n1, n2, n3, rb_rate, r_img, guide_img,
                    clean_border_img, convd_median_factor,
                    convd_big_factor):
    n = n1
    small_window = np.ones((n, n))
    small_window /= np.sum(small_window)
    clean_border_small = convolve2d(clean_border_img, small_window,
                                    mode="same", boundary="fill")

    n = n2
    median_window = np.ones((n, n))
    median_window /= np.sum(median_window)
    clean_border_convd_median = \
        convolve2d(clean_border_img, median_window, mode="same",
                   boundary="fill")

    n = n3
    big_window = np.ones((n, n))
    big_window /= np.sum(big_window)
    clean_border_convd_big = \
        convolve2d(clean_border_img, big_window,
                   mode="same", boundary="fill")

    l_imgs = []
    for d in range(3):
        # process each RGB channel separately
        rd_img = r_img[:, :, d]
        gd_img = guide_img[:, :, d]

        wn = []
        for _ in np.arange(0, rd_img.shape[1]):
            # binomial distribution: one column of 0/1 samples
            wn.append(np.random.binomial(1, rb_rate, rd_img.shape[0]))
        if rb_rate != 1:
            # step the binomial probability down for each RGB channel
            rb_rate = rb_rate - 0.1
        w = np.stack(wn, axis=1)

        d_img = np.where(np.logical_or(
            np.logical_and(
                clean_border_convd_median > convd_median_factor \
                * clean_border_convd_big.mean(), w == 1),
            np.logical_and(
                np.logical_and(clean_border_small > 0, w == 1),
                clean_border_convd_big > convd_big_factor \
                * clean_border_convd_big.mean())),
            rd_img, gd_img)

        l_imgs.append(d_img)
    img_cvt = np.stack(l_imgs, axis=2).astype("uint8")
    return img_cvt


def mix_mask_with_convd(do_mask_func, org_file=None, gd_file=None,
                        nbk=None, enhance=None, n1=5, n2=38, n3=1,
                        convd_median_factor=5.0, convd_big_factor=0.0,
                        cb=False, rb_rate=1, r_img=None,
                        guide_img=None, all_mask=False, show=False):
    if r_img is None:
        r_img = cp.resize_img(PIL.Image.open(org_file),
                              base_width=480, keep_size=False)

    l_img = np.float32(r_img.convert('L'))
    l_img = np.float32(l_img / 255)
    r_img = np.float32(r_img)
    if show:
        PrismaHelper.show_array_ipython(np.float32(r_img))
    if not callable(do_mask_func):
        raise TypeError('do_mask_func must be a function')

    clean_border_img = np.ones_like(
        l_img) * 255 if all_mask else do_mask_func(r_img=r_img,
                                                   l_img=l_img, cb=cb)
    if show:
        PrismaHelper.show_array_ipython(np.float32(clean_border_img))
    if guide_img is None:
        if gd_file is not None:
            guide_img = np.float32(
                cp.resize_img(PIL.Image.open(gd_file), base_width=480,
                              keep_size=False))
        else:
            guide_img = np.zeros_like(r_img)

    img_cvt = do_convd_filter(n1, n2, n3, rb_rate, r_img, guide_img,
                              clean_border_img,
                              convd_median_factor=convd_median_factor,
                              convd_big_factor=convd_big_factor)
    if nbk is not None:
        img_cvt = cp.fit_img(org_file, nbk=nbk, iter_n=10,
                             enhance=enhance, img_np=img_cvt)
    if show:
        PrismaHelper.show_array_ipython(np.float32(img_cvt))
    return img_cvt

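This example uses the do_features mask. Note rb_rate=0.66: its purpose is to let the feature edges blend smoothly into the guide features. Other optimizations are possible too, for example lowering convd_median_factor so that the extracted edges of the original features come out rounder and smoother.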

_ = mix_mask_with_convd(partial(do_features, loop_factor=1.1), '../prisma_gd/71758PICxSa_1024.jpg', '../prisma_gd/cx6.jpg', 'conv2/3x3_reduce', rb_rate=0.66, show=True)

The output is shown in Figure 9-17.

Figure 9-17 Result

For this photo of the basketball player Curry, the mean-reversion mask gives a fairly good result, while the other two methods both do poorly. Readers can test this themselves.

_ = mix_mask_with_convd(mask_stdmean_func, '../sample/kl.jpg', '../prisma_gd/cx7.jpg', 'conv2/3x3_reduce',
                        n2=88, convd_median_factor=0.1, rb_rate=1,
                        show=True)

The output is shown in Figure 9-18.

Figure 9-18 Result

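Now use abu2 to see how efficient this approach is, timing it with %time. Note the parameters n3=1, convd_median_factor=0.2 and convd_big_factor=0.0. In other words, without the large convolution kernel it runs very fast: only 5.62 s, and that includes some extra work such as displaying the original image.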

%time _ = mix_mask_with_convd(do_otsu, '../sample/abu2.jpg', '../prisma_gd/k5.jpg', 'conv2/3x3_reduce', n3=1, convd_median_factor=0.2, convd_big_factor=0.0, show=True)

The output is shown in Figure 9-19.

Figure 9-19 Result

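The next example uses partial(together_mask_func, func_list=[do_otsu, mask_features_func]) to combine several mask-extraction functions. The filter functions are combined by adding their masks together, so a pixel is selected when any of them selects it, and the combined result masks the image.

Without together_mask_func, each function on its own would need further parameter tuning. Using do_otsu alone, for instance, would require a larger n2 kernel, which hurts speed, whereas merging the features of several mask functions meets the need quickly and cleanly.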
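One detail is worth checking when masks stored as uint8 are added together: 255 + 255 does not fit in uint8 and wraps around. The sketch below combines two toy masks by accumulating in a wider dtype and clipping afterwards. The helper union_masks and the 4 x 4 masks are assumptions made for illustration, not the project's own code.

```python
import numpy as np


def union_masks(masks):
    """Add 0/255 masks in int32, then clip back into the uint8 range."""
    combined = np.zeros(masks[0].shape, dtype=np.int32)
    for m in masks:
        combined += m
    return np.uint8(np.clip(combined, 0, 255))


a = np.zeros((4, 4), dtype=np.uint8)
a[:, :2] = 255            # first mask selects columns 0-1
b = np.zeros((4, 4), dtype=np.uint8)
b[:, 1:3] = 255           # second mask selects columns 1-2

wrapped = a + b           # uint8 arithmetic: 255 + 255 wraps to 254
combined = union_masks([a, b])  # columns 0-2 are 255, column 3 is 0
```

The combined mask is the union of the inputs: a pixel is set as soon as one of the mask functions sets it.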

tgt_mask_func = partial(together_mask_func,
                        func_list=[do_otsu, mask_features_func])
_ = mix_mask_with_convd(tgt_mask_func, '../sample/abu3.jpg', '../prisma_gd/cx3.jpg', 'conv2/3x3_reduce',
                        cb=False, n2=68, n3=1, convd_median_factor=1,
                        convd_big_factor=0.0, show=True)

The output is shown in Figure 9-20.

Figure 9-20 Result

2. Strengthening the Prisma effect with preprocessing enhancement, random RGB shallow edges and more

Next, the author's idol Allen Iverson gets the image-preprocessing treatment. We start with Contrast. Note that rb_rate=0.88 is set here; in the code it works as follows:

wn = []
for _ in np.arange(0, rd_img.shape[1]):
    wn.append(np.random.binomial(1, rb_rate, rd_img.shape[0]))
if rb_rate != 1:
    rb_rate = rb_rate - 0.1
w = np.stack(wn, axis=1)

d_img = np.where(np.logical_or(np.logical_and(
    clean_border_convd_median > convd_median_factor * \
    clean_border_convd_big.mean(), w == 1), np.logical_and(
    np.logical_and(clean_border_small > 0, w == 1),
    clean_border_convd_big > convd_big_factor * \
    clean_border_convd_big.mean())), rd_img, gd_img)

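The logic adds a w == 1 test inside np.logical_and. It uses a binomial distribution to strengthen the psychedelic look of the rendering: the guide features are rendered at random pixels in individual RGB channels.

rb_rate = rb_rate - 0.1 makes the random rendering probability step down across the three channels. Many other rendering variants are possible here.

For example, the binomial matrix w resembles the sample matrix below: for every pixel in the image, the value is 1 with probability 0.8 and 0 with probability 0.2.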

wn = []
for _ in np.arange(0, 20):
    wn.append(np.random.binomial(1, 0.8, 20))
w = np.stack(wn, axis=1)
w

Output:

array([[1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1],
       [1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0],
       [1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0],
       [1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1],
       [0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1],
       [1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0],
       [1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1],
       [0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1],
       [1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0],
       [1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0],
       [1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
       [1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1],
       [1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1],
       [0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1],
       [1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1]])
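The stepped binomial mask described above can be followed channel by channel in a small sketch. It blends a constant "original" image with a constant "guide" image and measures how much of the original survives in each channel. The function blend_channels, the keep mask that selects every pixel, and the constant pixel values are assumptions made for illustration; the 0.1 step per channel follows the chapter's code.

```python
import numpy as np


def blend_channels(orig, guide, keep, rb_rate, rng):
    """Keep orig where keep and w == 1 hold, else take guide."""
    out = []
    for d in range(3):
        # one 0/1 draw per pixel, 1 with probability rb_rate
        w = rng.binomial(1, rb_rate, size=orig.shape[:2])
        out.append(np.where(np.logical_and(keep, w == 1),
                            orig[:, :, d], guide[:, :, d]))
        if rb_rate != 1:
            rb_rate = rb_rate - 0.1  # step down for the next channel
    return np.stack(out, axis=2)


rng = np.random.RandomState(0)
orig = np.full((200, 200, 3), 200.0)
guide = np.full((200, 200, 3), 50.0)
keep = np.ones((200, 200), dtype=bool)

mixed = blend_channels(orig, guide, keep, 0.88, rng)
# Fraction of original pixels kept in the R, G and B channels
kept = [(mixed[:, :, d] == 200.0).mean() for d in range(3)]
```

With rb_rate=0.88 the three channels keep roughly 88%, 78% and 68% of the original pixels, so the guide image bleeds through a little more in each successive channel, which produces the colored speckle.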

Input:

tgt_mask_func = partial(together_mask_func,
                        func_list=[do_otsu, mask_stdmean_func])
_ = mix_mask_with_convd(tgt_mask_func, '../sample/lfs.jpg', '../prisma_gd/cx11.jpg', 'conv2/norm2', enhance='Contrast',
                        rb_rate=0.88, n2=180, n3=1,
                        convd_median_factor=0.01,
                        convd_big_factor=0.0, show=True)

The output is shown in Figure 9-21.

Figure 9-21 Result

That does not look cool enough, which will not do. Let's put a Sharpness preprocessing step in front and see:

_ = mix_mask_with_convd(do_otsu, '../sample/lfs.jpg', '../prisma_gd/cx11.jpg', 'conv2/norm2',
                        enhance='Sharpness', rb_rate=0.88, cb=False,
                        n2=188, n3=1, convd_median_factor=0.01,
                        convd_big_factor=0.0, show=True)

The output is shown in Figure 9-22.

Figure 9-22 Result

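Look closely at the two results above and you will see that they do not actually use the shape features of the guide image. The stepped RGB rendering merely splashes a layer of shallow edge features over the original image, so the approach above is not really needed for this. Note the all_mask parameter of mix_mask_with_convd: when all_mask is True, the mask of the whole image is set to 255, as in the following code: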

clean_border_img = np.ones_like(l_img) * 255 if all_mask else do_mask_func(r_img=r_img, l_img=l_img, cb=cb)

So if you only want shallow edge features, just set all_mask.

Project Code Structure and Usage Examples

The code above is refactored once more into the file PrismaWorker; see PrismaWorker.py for details.

from PrismaWorker import PrismaWorkerClass
pw = PrismaWorkerClass()

We use two GTA5 images and blend the biker into the main group. The image shown in Figure 9-23 serves as the guide feature image:

pw.cp.resize_img(PIL.Image.open('../prisma_gd/gta2.jpg'),
                 base_width=480, keep_size=False)

The output is shown in Figure 9-23.

Figure 9-23 Original picture

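Note that the dd parameter in partial(do_otsu, dd=False) chooses whether to take the inside after Otsu thresholding or the inverted outside. As the black-and-white mask image below shows, dd=False picks out the rider; otherwise the outer background would be picked.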

_ = pw.mix_mask_with_convd(partial(pw.do_otsu, dd=False), '../sample/gta4.jpg', '../prisma_gd/gta2.jpg', 'conv2/3x3_reduce', enhance='Sharpness', n2=88, n3=1, convd_median_factor=1.5,
                           convd_big_factor=0.0, show=True)

The output is shown in Figure 9-24.

Figure 9-24 Result

The result is fairly good, apart from the two overlapping logos in the lower-left corner.

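Next, PrismaMaster batch-processes guide images over permutations of parameters such as preprocessing and feature-amplification layer; see PrismaMaster.py for details. The code below generates output for every parameter combination.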

import PrismaMaster

cb = False
n1 = 5
n2 = 38
n3 = 1
convd_median_factor = 0.6
convd_big_factor = 0.0
loop_factor = 1.0
std_factor = 0.88
# skip the '_split_' blobs and keep the first ten layer names
nbk_list = [nbk for nbk in list(cp.net.blobs.keys())[1:-2]
            if nbk[-8:-1] != '_split_'][:10]
org_file_list = ['../sample/bz1.jpg']
gd_file_list = [None, '../cx/cx3.jpg', '../cx/cx6.jpg', '../cx/cx7.jpg', '../cx/cx10.jpg']
enhance_list = [None, 'Sharpness', 'Contrast']
rb_rate_list = [0.85, 1.0]

save_dir = '../out/2016_11_24'
PrismaMaster.product_prisma(org_file_list, gd_file_list, nbk_list,
                            enhance_list, rb_rate_list, 'otsu_func',
                            n1, n2, n3, std_factor, loop_factor,
                            convd_median_factor,
                            convd_big_factor, cb, save_dir)
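How such a batch run enumerates its work can be sketched with itertools.product. This is an illustration only, not the code in PrismaMaster.py: the generator iter_param_grid and the shortened lists are assumptions, and the real function also loops over source files and layer names.

```python
from itertools import product

gd_file_list = [None, '../cx/cx3.jpg', '../cx/cx6.jpg']
enhance_list = [None, 'Sharpness', 'Contrast']
rb_rate_list = [0.85, 1.0]


def iter_param_grid(gd_files, enhances, rb_rates):
    """Yield one keyword dict per combination of the three lists."""
    for gd_file, enhance, rb_rate in product(gd_files, enhances, rb_rates):
        yield {'gd_file': gd_file, 'enhance': enhance, 'rb_rate': rb_rate}


grid = list(iter_param_grid(gd_file_list, enhance_list, rb_rate_list))
# 3 guide images x 3 enhancements x 2 rates = 18 combinations
```

Each dict could then be passed on as keyword arguments to a rendering call. The count grows multiplicatively with every list, which is how a modest set of options turns into hundreds of output images.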

Next we build a GUI control panel, PrismaController, using the traitsui library; see PrismaController.py for details.

Figure 9-25 GUI control panel

With a GUI like this, fine-tuning an image is easy, and many cool images can be produced, such as the two GTA-style images shown in Figure 9-26.

Figure 9-26 GTA-style results

The source material for the mash-up of Brother Sharp and GTA5 in Figure 9-26 is shown in Figure 9-27.

Figure 9-27 Original pictures

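If you do not know which settings look good, or you want every possible result, the GUI also has a button for batch-generating art images from the current parameters. It takes the n1, n2, dd and other parameters you have just tuned as fixed, permutes the guide feature image, the amplified feature layer, the preprocessing enhancement and so on, and generates hundreds or even thousands of styled images with one click. See PrismaController.py for details.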

Review and Afterword

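The way of implementing Prisma described in this section does not represent how the real product works. It is only one possible technical approach, and a lot of work remains within it. To address generality, for example, one might store a large set of dictionary entries whose keys are image matrix features and whose values are the processing parameters, then classify the input image, or match by feature similarity, to decide which parameters to use. Many such complications would need to be handled.
The code in this section does not pay much attention to runtime efficiency. For instance, the author found reading and saving images through scipy.misc to be much more efficient than through PIL, but PIL is used in this book for the sake of readability.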
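The dictionary idea mentioned above, with image features as keys and processing parameters as values, could look roughly like the sketch below. Everything in it is hypothetical: the gray-level histogram as the feature, the Euclidean distance as the similarity measure, the helper names gray_histogram and nearest_preset, and the preset parameter values are all chosen only to make the lookup concrete.

```python
import numpy as np


def gray_histogram(gray, bins=8):
    """Normalized gray-level histogram used as a crude image feature."""
    hist, _ = np.histogram(gray, bins=bins, range=(0.0, 1.0))
    return hist / float(hist.sum())


def nearest_preset(gray, presets):
    """Return the parameters whose stored feature is closest to gray's."""
    feat = gray_histogram(gray)
    dists = [np.linalg.norm(feat - key) for key, _ in presets]
    return presets[int(np.argmin(dists))][1]


dark = np.full((16, 16), 0.1)
bright = np.full((16, 16), 0.9)

# (feature, parameters) pairs; the parameter values are placeholders
presets = [
    (gray_histogram(dark), {'n2': 88, 'convd_median_factor': 0.1}),
    (gray_histogram(bright), {'n2': 38, 'convd_median_factor': 5.0}),
]

query = np.full((16, 16), 0.12)   # a new, mostly dark image
params = nearest_preset(query, presets)
```

A real system would need far richer features and many more presets, but the control flow stays the same: describe the input, find the closest stored description, and reuse its parameters.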

In short, this chapter only wants to tell you that bringing machine learning smoothly into a particular field takes more than deep-learning models: it also takes flexible thinking and the wisdom to adapt.
