zhengfg
zion
Commit e91ff555
Authored Nov 01, 2019 by 梁业锦
Added spiders for Coach and H&M
Parent: aa1b6c54
Showing 11 changed files with 508 additions and 55 deletions (+508 -55)
Changed files:
doc/SpiderSpecification.md +79 -25
...ava/com/diaoyun/zion/chinafrica/bis/impl/CoachSpider.java +47 -0
...n/java/com/diaoyun/zion/chinafrica/bis/impl/HmSpider.java +13 -23
...java/com/diaoyun/zion/chinafrica/bis/impl/ZaraSpider.java +2 -3
.../java/com/diaoyun/zion/chinafrica/enums/PlatformEnum.java +2 -0
...om/diaoyun/zion/chinafrica/factory/ItemSpiderFactory.java +8 -0
...aoyun/zion/chinafrica/service/impl/SpiderServiceImpl.java +8 -3
...com/diaoyun/zion/master/util/spider/CoachSpiderParse.java +160 -0
...va/com/diaoyun/zion/master/util/spider/HMSpiderParse.java +176 -0
.../com/diaoyun/zion/master/util/spider/LeviSpiderParse.java +1 -1
.../java/com/diaoyun/zion/master/util/spider/SpiderUtil.java +12 -0
doc/SpiderSpecification.md
View file @ e91ff555
...
@@ -27,87 +27,113 @@
 - Name: pullandbear
 - Spider status: **Done**
 - Anti-scraping measures are in place; the spider sometimes fails outright and is unstable
+- Defects:
+- Color and style data is incorrect
+- Sizes are not matched to styles
 ### [Gap](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/GapItemSpider.java)
 - Homepage: https://www.gap.cn/
 - Name: gap
 - Spider status: **Done**
+- Broken; data can no longer be scraped
 ### [Zara](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/ZaraSpider.java)
 - Homepage: https://www.zara.cn/cn
 - Name: zara
 - Spider status: **Done**
+- Possible defects:
 ### [Uniqlo(优衣库)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/UniqloSpider.java)
 - Homepage: https://www.uniqlo.cn/UNIQLO_U19FW_MEN.html
 - Name: uniqlo
 - Spider status: **Done**
+- The app cannot scrape data
 - Possible defects:
 - The image path downloads the image directly
 ### [Nike(耐克)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/NikeItemSpider.java)
 - Homepage: https://www.nike.com/cn
 - Name: nike
 - Spider status: **Done**
 ### [Adidas(阿迪达斯)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/AdidasSpider.java)
 - Homepage: https://www.adidas.com.cn/
 - Name: adidas
 - Spider status: **Done**
+- Usable, with known defects:
+- Product sizes do not match
-### H&M
+### [H&M](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/HmSpider.java)
 - Homepage: https://www2.hm.com/zh_cn/
 - Name: hm
 - Spider status: data can now be retrieved
 - The JSON is wrapped in a form that is hard to handle; existing tools cannot convert it to JSON
 - Product colors are distinguished by the product detail page URL; no pattern has been found yet
 ### LiLy
 - Homepage: http://www.lily.sh.cn/webapp/wcs/stores/servlet/lilystore
 - Name: lily
 - Spider status: analysis complete, implementation pending
 - Data is embedded in the HTML and is hard to process; scraping is postponed
 ### Eifini
 - Homepage: https://eifini.tmall.com/
 - Name: eifini
 - Spider status: no known approach
 - This shop is a store hosted on Tmall
 ### [Urban Revivo](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/UrbanRevivoSpider.java)
 - Homepage: http://www.ur.cn/index.html
 - Name: urbanrevivo
 - Spider status: **Done**
-- Sample product: http://wap.ur.com.cn/product/detail?productColorId=ff8080816dbb693e016dfd58f27c45d9
-- Data API: http://wap.ur.com.cn/product/product/detail?id=ff8080816dbb693e016dfd58f27c45d9
+- Data source:
+- Sample product: http://wap.ur.com.cn/product/detail?productColorId=ff8080816dbb693e016dfd58f27c45d9
+- Data API: http://wap.ur.com.cn/product/product/detail?id=ff8080816dbb693e016dfd58f27c45d9
+- Usable, with known defects:
 ### Abercrombie & Fitch
 - Homepage: https://www.abercrombie.cn/zh_CN/home
 - Name: abercrombie
 - Spider status: anti-scraping measures present
 - Links use an encoding-based anti-scraping mechanism
 ### [Under Armour(安德玛)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/UnderArmourSpider.java)
 - Homepage: https://www.underarmour.cn/
 - Name: ur
 - Spider status: **Done**
+- Usable, with known defects:
+- Too slow
+- Main image is broken
+- Sizes do not match stock
 ### Converse(匡威)
 - Homepage: https://www.converse.com.cn/
 - Name: converse
 - Spider status: anti-scraping measures present
 - A reverse-proxy anti-scraping mechanism is in place; the site cannot be scraped for now
-### [Ochirly](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/OchirlySpider.java)
+### [Ochirly(欧时力)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/OchirlySpider.java)
 - Homepage: http://www.ochirly.com.cn/SALE/list.shtml
 - Name: ochirly
 - Spider status: **Done**
+- Usable, with known defects:
-### [Esprit](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/EspritSpider.java)
+### [Esprit(埃斯普利特)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/EspritSpider.java)
 - Homepage: https://www.esprit.cn/
 - Name: esprit
 - Spider status: **Done**
+- Scraping from the app is broken
 ### [Levi(李维斯)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/LeviSpider.java)
-- Homepage: https://www.levi.com.cn/sale#page=3
+- Homepage: https://www.levi.com.cn/
 - Name: levi
 - Spider status: **Done**
-### [MO&Co.](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/MocoSpider.java)
+- Scraping from the app is broken
+### [MO&Co.(摩安珂)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/MocoSpider.java)
 - Homepage: https://www.moco.com/moco/zh/c/BS_DISCOUNT
 - Name: moco
 - Spider status: **Done**
+- Usable, with known defects:
+- Main image is broken
 ### [Massimo Dutti](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/MassimoduttiSpider.java)
 - Homepage: https://www.massimodutti.cn/cn/男装/季末折扣/休闲西装-c1745921.html
 - Name: massimodutti
...
@@ -115,18 +141,22 @@
 - Data source
 - Product detail: https://www.massimodutti.cn/cn/%E5%A5%B3%E8%A3%85/%E7%B3%BB%E5%88%97/%E8%A1%AC%E8%A1%AB%E5%92%8C%E7%BD%A9%E8%A1%AB/%E8%A1%AC%E8%A1%AB/%E6%BB%91%E9%9B%AA%E9%A3%8E%E7%B3%BB%E5%88%97%E9%A5%B0%E5%8F%A3%E8%A2%8B%E8%A1%AC%E8%A1%AB-c1718602p8730105.html?colorId=420&categoryId=1718602
 - Data API: https://www.massimodutti.cn/itxrest/2/catalog/store/35009478/30359500/category/0/product/8730105/detail?languageId=-7&appId=1
+- Scraping from the app is broken
-### COACH
+### [COACH(蔻驰)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/CoachSpider.java)
 - Homepage: https://china.coach.com/women.html
 - Name: coach
-- Spider status:
+- Spider status: **Done**
+- Data source
+- Product detail: https://china.coach.com/coach-essentials-oversize-cardigan/69007_LPK.html?c=8664
+- Data API: https://china.coach.com/rest/default/V1/applet/product/CONF69007_LPK
+- Known defect: the parser still needs to check whether color or size data exists
 ### Revolve
 - Homepage: https://www.revolve.com/wrangler/br/57f1a1/?utm_source=baidu&utm_medium=cpc&utm_campaign=intl_P_cn-d-Wrangler
 - Name: reolve
 - Spider status:
-### Vans
+### Vans(范斯)
 - Homepage: https://vans.com.cn/gallery-index---0---36.html
 - Name: Vans
...
@@ -174,11 +204,35 @@
 - [ZaraSpider.java](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/ZaraSpider.java)
 - For details on how the data is processed, see the @see comments in each spider
-# Processing scraped data in Java
+# Java tools for processing scraped data
+## Content retrieval tools
+- Fetch the content behind a URL as a string
+- String getContentByUrl(String sourceUrl, String sourceType)
+- Translation method
+- void translateProductResponse(JSONObject resultObj)
+- Currency conversion
+- String exchangeRate(String fullPrice)
 ## Processing JSON data
+### Common methods
+- Convert to JSON data
+- JSONObject.fromObject(Object object)
+- Get a node object
+- JSONObject getJSONObject(String key)
+- Get a node array
+- JSONArray getJSONArray(String key)
+- Get data as a string
+- String getString(String key)
 ## Processing HTML data
+- Convert to a Document object
+- Jsoup.parse(String html)
+- Select HTML elements; cssQuery uses CSS selector syntax
+- Elements select(String cssQuery)
+- Get the value of an attribute on an element; attributeKey is the attribute name
+- String attr(String attributeKey)
+- List<String> eachAttr(String attributeKey)
+- Get the text content of an element
+- String text()
+- List<String> eachText()
 # Spider JSON response specification
...
...
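The specification lists `String exchangeRate(String fullPrice)` as the currency conversion helper, and the `SpiderUtil` diff further down shows it dividing the price by a rate at scale 2, rounding up. The following is a minimal sketch of that arithmetic only. The class name `ExchangeRateSketch` and the rate `7.00` are assumptions for illustration; in the project the rate is loaded from `TbCfFeeService`. `RoundingMode.UP` is the enum equivalent of the `BigDecimal.ROUND_UP` constant used in the source.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical stand-in for SpiderUtil.exchangeRate; not the project's class.
final class ExchangeRateSketch {
    // Assumed rate for illustration only; the real value comes from TbCfFeeService.
    private static final BigDecimal RATE = new BigDecimal("7.00");

    /** Divides a digits-only price string by the rate, keeping 2 decimals and rounding up. */
    static String exchangeRate(String fullPrice) {
        return new BigDecimal(fullPrice).divide(RATE, 2, RoundingMode.UP).toString();
    }

    public static void main(String[] args) {
        System.out.println(exchangeRate("1500"));
    }
}
```

Because the helper takes and returns strings, callers have to strip currency symbols and separators before calling it.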
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/CoachSpider.java
0 → 100644
View file @ e91ff555
package com.diaoyun.zion.chinafrica.bis.impl;

import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.CoachSpiderParse;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.net.URISyntaxException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;

/**
 * COACH(蔻驰)
 *
 * @author 爱酱油不爱醋
 */
@Component("coachSpider")
public class CoachSpider implements IItemSpider {
    private static Logger logger = LoggerFactory.getLogger(CoachSpider.class);

    /**
     * Coach data spider
     * @see CoachSpiderParse#formatProductResponse method that formats the data
     * @param targetUrl URL of the product detail page
     * @return formatted and translated JSON data
     */
    @Override
    public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
        // Segment 2 is the host; segment 4 is "<productId>.html" plus any query string
        String[] urlSpilt = targetUrl.split("/");
        // split() takes a regex, so the dot is escaped to match a literal "."
        String[] pIdSpilt = urlSpilt[4].split("\\.html");
        String pId = pIdSpilt[0];
        // Point the request at the product data API instead of the HTML page
        targetUrl = "https://" + urlSpilt[2] + "/rest/default/V1/applet/product/CONF" + pId;
        String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.COACH.getValue());
        JSONObject resultObj = JSONObject.fromObject(content);
        ProductResponse productResponse = CoachSpiderParse.formatProductResponse(resultObj, pId);
        resultObj = JSONObject.fromObject(productResponse);
        TranslateHelper.translateProductResponse(resultObj);
        return resultObj;
    }
}
\ No newline at end of file
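`captureItem` derives the product ID by splitting the detail URL on `/` and stripping the `.html` suffix, then rebuilds the URL against the `/rest/default/V1/applet/product/CONF<id>` endpoint. The sketch below does the same mapping with `java.net.URI`, so a query string or a different number of path segments does not shift the index. `CoachUrlSketch`, `productId`, and `apiUrl` are hypothetical names, not part of the commit.

```java
import java.net.URI;

// Hypothetical helper illustrating the URL mapping; not the project's code.
final class CoachUrlSketch {
    private static final String SUFFIX = ".html";

    /** Extracts "69007_LPK" from ".../some-slug/69007_LPK.html?c=8664". */
    static String productId(String detailUrl) {
        String path = URI.create(detailUrl).getPath();           // query string is excluded
        String file = path.substring(path.lastIndexOf('/') + 1); // last path segment
        if (!file.endsWith(SUFFIX)) {
            throw new IllegalArgumentException("Not a product detail URL: " + detailUrl);
        }
        return file.substring(0, file.length() - SUFFIX.length());
    }

    /** Builds the data API URL on the same host as the detail page. */
    static String apiUrl(String detailUrl) {
        String host = URI.create(detailUrl).getHost();
        return "https://" + host + "/rest/default/V1/applet/product/CONF" + productId(detailUrl);
    }

    public static void main(String[] args) {
        System.out.println(apiUrl(
                "https://china.coach.com/coach-essentials-oversize-cardigan/69007_LPK.html?c=8664"));
    }
}
```

For the sample product in the specification this yields the data API URL documented there.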
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/HmSpider.java
View file @ e91ff555
...
@@ -2,15 +2,25 @@ package com.diaoyun.zion.chinafrica.bis.impl;
 import com.diaoyun.zion.chinafrica.bis.IItemSpider;
 import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
+import com.diaoyun.zion.chinafrica.vo.ProductResponse;
 import com.diaoyun.zion.master.util.HttpClientUtil;
 import com.diaoyun.zion.master.util.JsoupUtil;
+import com.diaoyun.zion.master.util.TranslateHelper;
+import com.diaoyun.zion.master.util.spider.HMSpiderParse;
 import net.sf.json.JSONObject;
+import org.jsoup.Jsoup;
+import org.jsoup.nodes.Document;
+import org.jsoup.nodes.Element;
+import org.jsoup.select.Elements;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 import org.springframework.stereotype.Component;
+import javax.print.Doc;
 import java.io.IOException;
 import java.net.URISyntaxException;
+import java.util.HashMap;
+import java.util.Map;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.TimeoutException;
...
@@ -38,30 +48,10 @@ public class HmSpider implements IItemSpider {
     @Override
     public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
         String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.HM.getValue());
-        String detailStr = JsoupUtil.getScriptContent(content, "productArticleDetails");
-        int firstBrackets = detailStr.indexOf("{");
-        int lastbrackets = detailStr.lastIndexOf("}");
-        String resultStr = detailStr.substring(firstBrackets, lastbrackets + 1);
-        int firstImage = detailStr.indexOf("'images':[");
-        int lastImage = detailStr.lastIndexOf("'video':");
-        detailStr = detailStr.substring(firstImage, lastImage);
-        resultStr = resultStr.replace(detailStr, "");
-        JSONObject resultObj = JSONObject.fromObject(resultStr);
+        ProductResponse productResponse = HMSpiderParse.formatProductResponse(content);
+        JSONObject resultObj = JSONObject.fromObject(productResponse);
+        TranslateHelper.translateProductResponse(resultObj);
         return resultObj;
     }
-    public static void main(String[] args) throws Exception {
-        String targetUrl = "https://www2.hm.com/zh_cn/productpage.0754698003.html";
-        String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.ZARA.getValue());
-        String detailStr = JsoupUtil.getScriptContent(content, "productArticleDetails");
-        int firstBrackets = detailStr.indexOf("{");
-        int lastbrackets = detailStr.lastIndexOf("}");
-        String resultStr = detailStr.substring(firstBrackets, lastbrackets + 1);
-        resultStr = resultStr.replace("isDesktop ? ", "");
-        String regexp = "\'";
-        resultStr = resultStr.replaceAll(regexp, "\"");
-        JSONObject resultObj = JSONObject.fromObject(resultStr);
-        System.err.println(resultObj);
-    }
}
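The H&M page embeds its product data as a JavaScript object literal rather than JSON: strings are single-quoted and some values are `isDesktop ? a : b` ternaries, which is why the code rewrites the text before calling `JSONObject.fromObject`. The sketch below shows one possible normalization using only `java.util.regex`. It is an alternative to the commit's string replacements, not a copy of them: it keeps the desktop branch of each ternary. The class name, the regex, and the sample input are assumptions.

```java
import java.util.regex.Pattern;

// Hypothetical normalizer for a JS object literal; assumes no apostrophes inside values.
final class JsLiteralSketch {
    // Matches: isDesktop ? "<desktop value>" : "<mobile value>"
    private static final Pattern TERNARY =
            Pattern.compile("isDesktop\\s*\\?\\s*(\"[^\"]*\")\\s*:\\s*\"[^\"]*\"");

    /** Converts single quotes to double quotes and resolves each ternary to its desktop branch. */
    static String toJson(String jsObject) {
        String doubleQuoted = jsObject.replace('\'', '"');
        return TERNARY.matcher(doubleQuoted).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(toJson("{'name':'Shirt','image': isDesktop ? '//a.jpg' : '//b.jpg'}"));
    }
}
```

A blanket quote replacement breaks as soon as a product name contains an apostrophe, so a real implementation would need a tolerant parser or a stricter tokenizer.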
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/ZaraSpider.java
View file @ e91ff555
...
@@ -26,12 +26,12 @@ public class ZaraSpider implements IItemSpider {
     private static Logger logger = LoggerFactory.getLogger(ZaraSpider.class);
     /**
-     * Massimo Dutti product detail page URL
+     * Zara product detail page URL
      */
     private static final String ZARA_URL = "https://www.zara.cn/cn/zh/";
     /**
-     * Massimo Dutti data spider
+     * Zara data spider
      * @see com.diaoyun.zion.chinafrica.service.impl.SpiderServiceImpl# modifies the product detail page path
      * @see ZaraSpiderParse#getJsonData returns the extracted main product data
      * @see ZaraSpiderParse#formatProductResponse method that formats the data
...
@@ -49,5 +49,4 @@ public class ZaraSpider implements IItemSpider {
         return resultObj;
     }
 }
src/main/java/com/diaoyun/zion/chinafrica/enums/PlatformEnum.java
View file @ e91ff555
...
@@ -30,6 +30,8 @@ public enum PlatformEnum implements EnumItemable<PlatformEnum> {
     LEVI("李维斯", "levi"),
     MOCO("MO&Co.", "moco"),
     MASSIMODUTTI("MassimoDutti", "massimodutti"),
+    COACH("蔻驰", "coach"),
+    VANS("范斯", "vans"),
     UN("未知", "un"),
     AfriEshop("afri-eshop", "afri-eshop");
...
...
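Each `PlatformEnum` constant pairs a display label with a short value such as `"coach"`, and the spiders read them through `getLabel()` and `getValue()`. A minimal sketch of that shape, with a lookup from value back to constant, is shown here. `PlatformSketch` and `fromValue` are hypothetical; the labels are written in ASCII for the example, and the real enum also implements `EnumItemable`.

```java
import java.util.Arrays;

// Hypothetical reduced version of PlatformEnum, for illustration only.
enum PlatformSketch {
    COACH("Coach", "coach"),
    VANS("Vans", "vans"),
    UN("Unknown", "un");

    private final String label;
    private final String value;

    PlatformSketch(String label, String value) {
        this.label = label;
        this.value = value;
    }

    String getLabel() { return label; }

    String getValue() { return value; }

    /** Returns the constant whose value matches, or UN when nothing matches. */
    static PlatformSketch fromValue(String value) {
        return Arrays.stream(values())
                .filter(p -> p.value.equals(value))
                .findFirst()
                .orElse(UN);
    }
}
```

Falling back to `UN` mirrors how the enum already reserves a constant for unknown platforms.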
src/main/java/com/diaoyun/zion/chinafrica/factory/ItemSpiderFactory.java
View file @ e91ff555
...
@@ -87,6 +87,14 @@ public class ItemSpiderFactory {
                 iItemSpider = (IItemSpider) SpringContextUtil.getBean("massimoduttiSpider");
                 break;
             }
+            case "coach": {
+                iItemSpider = (IItemSpider) SpringContextUtil.getBean("coachSpider");
+                break;
+            }
+            case "vans": {
+                iItemSpider = (IItemSpider) SpringContextUtil.getBean("vansSpider");
+                break;
+            }
             case "afri-eshop":{
                 iItemSpider = (IItemSpider) SpringContextUtil.getBean("africaShopItemSpider");
                 break;
...
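The factory maps a platform value to a Spring bean name through a `switch`, so every new spider adds another `case` block. One alternative, sketched here under the assumption that the mapping is a plain lookup, is a map from platform value to bean name. `SpiderBeanNames` and `beanNameFor` are hypothetical; the four entries are the ones visible in this hunk.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup table replacing the switch; bean names taken from the hunk above.
final class SpiderBeanNames {
    private static final Map<String, String> BEANS = Map.of(
            "massimodutti", "massimoduttiSpider",
            "coach", "coachSpider",
            "vans", "vansSpider",
            "afri-eshop", "africaShopItemSpider");

    /** Returns the bean name for a platform value, or empty when the platform is unknown. */
    static Optional<String> beanNameFor(String platformValue) {
        if (platformValue == null) {
            return Optional.empty(); // Map.of(...) rejects null keys in get()
        }
        return Optional.ofNullable(BEANS.get(platformValue));
    }
}
```

The caller would then pass the resulting name to `SpringContextUtil.getBean`, as the existing `case` blocks do.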
src/main/java/com/diaoyun/zion/chinafrica/service/impl/SpiderServiceImpl.java
View file @ e91ff555
...
@@ -53,13 +53,13 @@ public class SpiderServiceImpl implements SpiderService {
             platformEnum = PlatformEnum.GAP;
         } else if (targetUrl.contains("www.nike.com/cn/t/")) {
             platformEnum = PlatformEnum.NIKE;
         } else if (targetUrl.contains("www.afri-eshop.com") && targetUrl.contains("/products/")) {
             platformEnum = PlatformEnum.AfriEshop;
         } else if (targetUrl.contains("zara.cn")) {
             platformEnum = PlatformEnum.ZARA;
-        } else if (targetUrl.contains("h.uniqlo.cn/")) {
+        } else if (targetUrl.contains("uniqlo") && targetUrl.contains("#/product?pid")) {
             platformEnum = PlatformEnum.UNIQLO;
-        } else if (targetUrl.contains("hm.com/zh_cn/productpage")) {
+        } else if (targetUrl.contains("hm.com/") && targetUrl.contains("productpage")) {
             platformEnum = PlatformEnum.HM;
         } else if (targetUrl.contains("https://www.adidas.com.cn/item")) {
             platformEnum = PlatformEnum.ADIDAS;
...
@@ -79,7 +79,12 @@ public class SpiderServiceImpl implements SpiderService {
             platformEnum = PlatformEnum.MOCO;
         } else if (targetUrl.contains("massimodutti.cn") && targetUrl.contains("colorId") && targetUrl.contains("categoryId")) {
             platformEnum = PlatformEnum.MASSIMODUTTI;
+        } else if (targetUrl.contains("coach.com/coach")) {
+            platformEnum = PlatformEnum.COACH;
+        } else if (targetUrl.contains("vans.com") && targetUrl.contains("wap/product")) {
+            platformEnum = PlatformEnum.VANS;
         }
         return platformEnum;
     }
}
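Platform detection here is a chain of `targetUrl.contains(...)` checks, where some platforms require several fragments at once. The sketch below expresses a subset of the same rules as data, so each rule is one line and the matching loop stays fixed. `PlatformDetectSketch`, `Rule`, and `detect` are hypothetical names; the fragments are copied from the hunks above, and platform values are returned as strings instead of `PlatformEnum` constants to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical table-driven version of the else-if chain; covers a subset of platforms.
final class PlatformDetectSketch {
    /** A platform value plus the fragments that must all occur in the URL. */
    private static final class Rule {
        final String platform;
        final List<String> fragments;

        Rule(String platform, String... fragments) {
            this.platform = platform;
            this.fragments = Arrays.asList(fragments);
        }

        boolean matches(String url) {
            return fragments.stream().allMatch(url::contains);
        }
    }

    // Order matters: the first matching rule wins, as in the original chain.
    private static final List<Rule> RULES = Arrays.asList(
            new Rule("zara", "zara.cn"),
            new Rule("uniqlo", "uniqlo", "#/product?pid"),
            new Rule("hm", "hm.com/", "productpage"),
            new Rule("coach", "coach.com/coach"),
            new Rule("vans", "vans.com", "wap/product"));

    static String detect(String targetUrl) {
        for (Rule rule : RULES) {
            if (rule.matches(targetUrl)) {
                return rule.platform;
            }
        }
        return "un"; // value of PlatformEnum.UN
    }
}
```

Note that the `coach.com/coach` fragment only matches product pages whose slug starts with `coach`; a Coach URL with a different slug falls through to the unknown platform.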
src/main/java/com/diaoyun/zion/master/util/spider/CoachSpiderParse.java
0 → 100644
View file @ e91ff555
package com.diaoyun.zion.master.util.spider;

import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;

import java.util.*;
import java.util.regex.Pattern;

import static com.diaoyun.zion.master.util.spider.SpiderUtil.exchangeRate;

/**
 * COACH(蔻驰) spider data parser
 * @see com.diaoyun.zion.chinafrica.bis.impl.CoachSpider
 * @author 爱酱油不爱醋
 */
public class CoachSpiderParse {

    /**
     * Formats the response data
     * @param dataMap the main JSON data
     * @param pId the product ID
     * @return the formatted data
     */
    public static ProductResponse formatProductResponse(JSONObject dataMap, String pId) {
        // Create the response wrapper
        ProductResponse productResponse = new ProductResponse();
        // Properties: Coach products have color and size
        Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
        // Original prices
        List<OriginalPrice> originalPriceList = new ArrayList<>();
        // Promotional prices
        List<ProductPromotion> promotionList = new ArrayList<>();
        // Stock
        DynStock dynStock = new DynStock();
        // The data contains no exact stock count, so default to ample stock
        dynStock.setSellableQuantity(9999);
        // Get the "data" node object
        JSONObject dataObj = dataMap.getJSONObject("data");

        //////////////////// Product basic info ////////////////////
        ItemInfo itemInfo = new ItemInfo();
        itemInfo.setShopName(PlatformEnum.COACH.getLabel());
        itemInfo.setShopUrl("https://china.coach.com");
        itemInfo.setItemId(pId);
        itemInfo.setTitle(dataObj.getString("name"));
        //////////////////// Product basic info (picture is set below) END ////////////////////

        List<String> sizeNoList = new ArrayList<>();
        List<String> colorNoList = new ArrayList<>();
        // Get the "attributes" node array
        JSONArray attributesArr = dataObj.getJSONArray("attributes");
        for (int i = 0; i < attributesArr.size(); i++) {
            //////////////////// Color property ////////////////////
            // Index 0 holds the color attribute
            if (i == 0) {
                // Get the "values" node array
                JSONArray valuesArr = attributesArr.getJSONObject(i).getJSONArray("values");
                Set<ProductProp> propSet = new HashSet<>(16);
                for (int j = 0; j < valuesArr.size(); j++) {
                    JSONObject valuesObj = valuesArr.getJSONObject(j);
                    // Image URL
                    String imageUrl = valuesObj.getString("image");
                    // Set the picture in the product basic info
                    if (i == 0) {
                        itemInfo.setPic(imageUrl);
                    }
                    ProductProp productPropColor = new ProductProp();
                    String colorNo = valuesObj.getString("value_index");
                    colorNoList.add(colorNo);
                    productPropColor.setPropId(colorNo);
                    productPropColor.setPropName(valuesObj.getString("label"));
                    productPropColor.setImage(imageUrl);
                    propSet.add(productPropColor);
                    if (productPropSet.get("颜色") == null) {
                        productPropSet.put("颜色", propSet);
                    } else {
                        Set<ProductProp> oldPropSet = productPropSet.get("颜色");
                        propSet.addAll(oldPropSet);
                        productPropSet.put("颜色", propSet);
                    }
                }
                //////////////////// Color property END ////////////////////
                //////////////////// Size property ////////////////////
                // Index 1 holds the size attribute (some products, such as handbags, may not have one)
            } else if (i == 1) {
                // Get the "values" node array
                JSONArray valuesArr = attributesArr.getJSONObject(i).getJSONArray("values");
                Set<ProductProp> sizePropSet = new HashSet<>();
                for (int j = 0; j < valuesArr.size(); j++) {
                    JSONObject valuesObj = valuesArr.getJSONObject(j);
                    ProductProp productPropSize = new ProductProp();
                    String sizeNo = valuesObj.getString("value_index");
                    productPropSize.setPropId(sizeNo);
                    sizeNoList.add(sizeNo);
                    productPropSize.setPropName(valuesObj.getString("label"));
                    sizePropSet.add(productPropSize);
                    if (productPropSet.get("尺码") == null) {
                        productPropSet.put("尺码", sizePropSet);
                    } else {
                        Set<ProductProp> oldPropSet = productPropSet.get("尺码");
                        sizePropSet.addAll(oldPropSet);
                        productPropSet.put("尺码", sizePropSet);
                    }
                    //////////////////// Size property END ////////////////////
                }
            }
        }
        for (String colorNo : colorNoList) {
            for (String sizeNo : sizeNoList) {
                // Build the SKU string
                String skuStr = ";" + colorNo + ";" + sizeNo + ";";
                // Flag: the product carries stock information
                productResponse.setStockFlag(true);
                List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
                if (productSkuStockList == null) {
                    productSkuStockList = new ArrayList<>();
                }
                ProductSkuStock productSkuStock = new ProductSkuStock();
                // Sellable quantity; no real stock data is available
                productSkuStock.setSellableQuantity(999);
                // SKU string this stock entry belongs to
                productSkuStock.setSkuStr(skuStr);
                productSkuStockList.add(productSkuStock);
                dynStock.setProductSkuStockList(productSkuStockList);
                //////////////////// Stock END ////////////////////
                //////////////////// Original price ////////////////////
                // Original price of the product (a discounted price may also exist)
                OriginalPrice originalPrice = new OriginalPrice();
                String fullPrice = Pattern.compile("[^0-9]").matcher(dataObj.getString("price")).replaceAll("").trim();
                // TODO convert the currency; the price is currently in CNY
                fullPrice = exchangeRate(fullPrice);
                productResponse.setPrice(fullPrice);
                productResponse.setSalePrice(fullPrice + "-" + fullPrice);
                originalPrice.setPrice(fullPrice);
                originalPrice.setSkuStr(skuStr);
                originalPriceList.add(originalPrice);
                //////////////////// Original price END ////////////////////
            }
        }
        productResponse.setPropFlag(true);
        productResponse.setProductPropSet(productPropSet);
        productResponse.setPlatform(PlatformEnum.COACH.getValue());
        productResponse.setPromotionList(promotionList);
        productResponse.setOriginalPriceList(originalPriceList);
        productResponse.setItemInfo(itemInfo);
        productResponse.setDynStock(dynStock);
        return productResponse;
    }
}
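The parser builds one SKU string of the form `;color;size;` for every color and size pair. Because the loops are nested, a product with no size attribute (the comment mentions handbags) produces no SKU entries at all, and the price is never set; this matches the defect noted in the specification about checking whether color or size data exists. The sketch below shows one way to handle a missing dimension. `SkuSketch` and the single-dimension format `;code;` are assumptions, since the commit does not define what a one-dimensional SKU string should look like.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical SKU builder that tolerates a missing color or size list.
final class SkuSketch {
    static List<String> skuStrings(List<String> colorNos, List<String> sizeNos) {
        List<String> skus = new ArrayList<>();
        if (sizeNos.isEmpty()) {
            // Assumed format when only colors exist
            for (String colorNo : colorNos) {
                skus.add(";" + colorNo + ";");
            }
            return skus;
        }
        if (colorNos.isEmpty()) {
            // Assumed format when only sizes exist
            for (String sizeNo : sizeNos) {
                skus.add(";" + sizeNo + ";");
            }
            return skus;
        }
        // Both dimensions present: same ";color;size;" format as the parser
        for (String colorNo : colorNos) {
            for (String sizeNo : sizeNos) {
                skus.add(";" + colorNo + ";" + sizeNo + ";");
            }
        }
        return skus;
    }
}
```

Whatever format is chosen has to agree with how the consumer of `skuStr` looks up stock and prices.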
src/main/java/com/diaoyun/zion/master/util/spider/HMSpiderParse.java
0 → 100644
View file @ e91ff555
package com.diaoyun.zion.master.util.spider;

import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.JsoupUtil;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.math.BigDecimal;
import java.util.*;

import static com.diaoyun.zion.master.util.spider.SpiderUtil.exchangeRate;

/**
 * H&M spider data parser
 *
 * @author 爱酱油不爱醋
 */
public class HMSpiderParse {

    /**
     * Formats the response data
     * @param content the page content
     * @return the formatted data
     */
    public static ProductResponse formatProductResponse(String content) {
        // Extract the main data and convert it to a JSON object and a Document object
        String detailStr = JsoupUtil.getScriptContent(content, "productArticleDetails");
        int firstBrackets = detailStr.indexOf("{");
        int lastbrackets = detailStr.lastIndexOf("}");
        String resultStr = detailStr.substring(firstBrackets, lastbrackets + 1);
        resultStr = resultStr.replaceAll("\'", "\"")
                .replaceAll("\"image\": isDesktop [?] ", "")
                .replaceAll("\"fullscreen\": isDesktop [?] ", "")
                .replaceAll("\"zoom\": isDesktop [?] ", "");
        JSONObject dataMap = JSONObject.fromObject(resultStr);
        Document document = Jsoup.parse(content);
        // Create the response wrapper
        ProductResponse productResponse = new ProductResponse();
        // Properties: H&M products have color and size
        Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
        // Original prices
        List<OriginalPrice> originalPriceList = new ArrayList<>();
        // Promotional prices
        List<ProductPromotion> promotionList = new ArrayList<>();
        // Stock
        DynStock dynStock = new DynStock();
        // The data contains no exact stock count, so default to ample stock
        dynStock.setSellableQuantity(9999);

        //////////////////// Product basic info ////////////////////
        ItemInfo itemInfo = new ItemInfo();
        itemInfo.setShopName(PlatformEnum.HM.getLabel());
        itemInfo.setShopUrl("https://www2.hm.com/");
        itemInfo.setItemId(document.select("div[class=article-code]").select("li").text());
        itemInfo.setTitle(document.select("h1[class=primary product-item-headline]").text());
        //////////////////// Product basic info (picture is set below) END ////////////////////

        //////////////////// Color property ////////////////////
        // Read the color list from the page
        Elements colorEle = document.select("div[class=mini-slider]").select("ul[class=inputlist clearfix]").select("li");
        Map<String, String> colorMap = new HashMap<>(16);
        Set<ProductProp> propSet = new HashSet<>(16);
        for (Element element : colorEle) {
            ProductProp productPropColor = new ProductProp();
            String color = element.select("a").attr("data-color");
            String colorNo = element.select("a").attr("data-articlecode");
            String imgUrl = element.select("noscript").attr("data-src");
            colorMap.put(colorNo, color);
            itemInfo.setPic(imgUrl);
            productPropColor.setPropId(colorNo);
            productPropColor.setPropName(color);
            productPropColor.setImage(imgUrl);
            propSet.add(productPropColor);
            if (productPropSet.get("颜色") == null) {
                productPropSet.put("颜色", propSet);
            } else {
                Set<ProductProp> oldPropSet = productPropSet.get("颜色");
                propSet.addAll(oldPropSet);
                productPropSet.put("颜色", propSet);
            }
        }
        //////////////////// Color property END ////////////////////

        //////////////////// Size property ////////////////////
        Map<String, String> sizeMap = new HashMap<>(16);
        Map<String, Map<String, String>> colorHaveSizeMap = new HashMap<>(16);
        Set<ProductProp> sizePropSet = new HashSet<>();
        for (Map.Entry<String, String> entry : colorMap.entrySet()) {
            JSONArray sizeArr = dataMap.getJSONObject(entry.getKey()).getJSONArray("sizes");
            for (int i = 0; i < sizeArr.size(); i++) {
                JSONObject sizeObj = sizeArr.getJSONObject(i);
                ProductProp productPropSize = new ProductProp();
                String sizeNo = sizeObj.getString("sizeCode");
                String size = sizeObj.getString("name");
                sizeMap.put(sizeNo, size);
                productPropSize.setPropId(sizeNo);
                productPropSize.setPropName(size);
                sizePropSet.add(productPropSize);
                if (productPropSet.get("尺码") == null) {
                    productPropSet.put("尺码", sizePropSet);
                } else {
                    Set<ProductProp> oldPropSet = productPropSet.get("尺码");
                    sizePropSet.addAll(oldPropSet);
                    productPropSet.put("尺码", sizePropSet);
                }
            }
            colorHaveSizeMap.put(entry.getKey(), sizeMap);
        }
        //////////////////// Size property END ////////////////////

        productResponse.setStockFlag(true);
        List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
        // Read the price
        String fullPrice = document.select("div[class=primary-row product-item-price]").text();
        fullPrice = SpiderUtil.retainNumber(fullPrice);
        // TODO convert the currency; the price is currently in CNY
        fullPrice = exchangeRate(fullPrice);
        BigDecimal priceOld = new BigDecimal(fullPrice);
        BigDecimal div = new BigDecimal("100");
        fullPrice = priceOld.divide(div, 2, BigDecimal.ROUND_DOWN).toString();
        if (productSkuStockList == null) {
            productSkuStockList = new ArrayList<>();
        }
        for (Map.Entry<String, Map<String, String>> colorEntry : colorHaveSizeMap.entrySet()) {
            for (Map.Entry<String, String> sizeEntry : sizeMap.entrySet()) {
                // Build the SKU string
                String skuStr = ";" + colorEntry.getKey() + ";" + sizeEntry.getKey() + ";";
                //////////////////// Stock ////////////////////
                // H&M has no usable stock data yet
                ProductSkuStock productSkuStock = new ProductSkuStock();
                productSkuStock.setSkuStr(skuStr);
                productSkuStock.setSellableQuantity(999);
                productSkuStockList.add(productSkuStock);
                dynStock.setProductSkuStockList(productSkuStockList);
                //////////////////// Stock END ////////////////////
                //////////////////// Original price ////////////////////
                OriginalPrice originalPrice = new OriginalPrice();
                originalPrice.setSkuStr(skuStr);
                originalPrice.setPrice(fullPrice);
                originalPriceList.add(originalPrice);
                productResponse.setPrice(fullPrice);
                productResponse.setSalePrice(fullPrice + "-" + fullPrice);
                //////////////////// Original price END ////////////////////
            }
        }
        productResponse.setPropFlag(true);
        productResponse.setProductPropSet(productPropSet);
        productResponse.setPlatform(PlatformEnum.HM.getValue());
        productResponse.setPromotionList(promotionList);
        productResponse.setOriginalPriceList(originalPriceList);
        productResponse.setItemInfo(itemInfo);
        productResponse.setDynStock(dynStock);
        return productResponse;
    }
}
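In the parser above, `colorHaveSizeMap` stores the same `sizeMap` instance for every color, and the SKU loop iterates `sizeMap` directly, so each color ends up paired with the union of all sizes seen on the page. If the intent is to pair each color only with its own sizes, the structure could look like the sketch below. `ColorSizeSketch` is a hypothetical name, and the input map stands in for the per-color `sizes` arrays read from the page JSON.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical per-color SKU builder; each color keeps its own size list.
final class ColorSizeSketch {
    /** Builds ";color;size;" strings using only the sizes listed for each color. */
    static List<String> skuStrings(Map<String, List<String>> sizesByColor) {
        List<String> skus = new ArrayList<>();
        for (Map.Entry<String, List<String>> colorEntry : sizesByColor.entrySet()) {
            for (String sizeNo : colorEntry.getValue()) {
                skus.add(";" + colorEntry.getKey() + ";" + sizeNo + ";");
            }
        }
        return skus;
    }
}
```

In the parser this would mean creating a fresh size map inside the color loop and iterating `colorEntry.getValue()` when building SKU strings.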
src/main/java/com/diaoyun/zion/master/util/spider/LeviSpiderParse.java
View file @ e91ff555
...
@@ -96,8 +96,8 @@ public class LeviSpiderParse {
                         sizePropSet.addAll(oldPropSet);
                         productPropSet.put("尺码", sizePropSet);
                     }
-                    //////////////////// Size property END ////////////////////
                 }
+            //////////////////// Size property END ////////////////////
             for (String colorNo : colorNoList) {
                 for (String sizeNo : sizeNoList) {
...
src/main/java/com/diaoyun/zion/master/util/spider/SpiderUtil.java
View file @ e91ff555
...
@@ -13,11 +13,13 @@ import org.apache.commons.lang3.StringUtils;
 import java.math.BigDecimal;
 import java.util.*;
+import java.util.regex.Pattern;
 public class SpiderUtil {
     private static BigDecimal rate;
     static {
         TbCfFeeService tbCfFeeService = (TbCfFeeService) SpringContextUtil.getBean("tbCfFeeService");
         rate = tbCfFeeService.getRateFee().getFeeRate();
...
@@ -36,6 +38,16 @@ public class SpiderUtil {
         return new BigDecimal(fullPrice).divide(rate, 2, BigDecimal.ROUND_UP).toString();
     }
+    /**
+     * Removes every character that is not a digit
+     * @param str the input string
+     * @return a string containing digits only
+     */
+    public static String retainNumber(String str) {
+        str = Pattern.compile("[^0-9]").matcher(str).replaceAll("").trim();
+        return str;
+    }
     /**
      * Formats the gap response data
      *
...
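`retainNumber` strips everything except the digits `0-9`, which also removes the decimal point and thousands separators. A small sketch of the same regex makes that effect visible. `RetainNumberSketch` is a hypothetical name, and the sample price string is an assumption about what a scraped price might look like.

```java
import java.util.regex.Pattern;

// Hypothetical copy of the digit-only filter, with the pattern compiled once.
final class RetainNumberSketch {
    private static final Pattern NON_DIGIT = Pattern.compile("[^0-9]");

    /** Removes every character that is not an ASCII digit. */
    static String retainNumber(String str) {
        return NON_DIGIT.matcher(str).replaceAll("");
    }

    public static void main(String[] args) {
        // The decimal point is dropped, so the result is in hundredths of the unit
        System.out.println(retainNumber("RMB 1,299.00"));
    }
}
```

A price with two decimals therefore comes back multiplied by 100, which appears to be why `HMSpiderParse` divides the converted price by 100 afterwards; a price without decimals would need different handling.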