Commit da07cbe8
Authored Nov 11, 2019 by 梁业锦

Added spiders for Stradivarius and ZaraHome; tidied the code structure of the other spiders to improve readability.

Parent: a508f493

Showing 15 changed files with 472 additions and 86 deletions (+472 -86).
Changed files:
doc/SpiderSpecification.md (+17 -38)
.../com/diaoyun/zion/chinafrica/bis/impl/ConverseSpider.java (+16 -0)
...ava/com/diaoyun/zion/chinafrica/bis/impl/FendiSpider.java (+5 -0)
...ava/com/diaoyun/zion/chinafrica/bis/impl/GucciSpider.java (+2 -1)
...n/java/com/diaoyun/zion/chinafrica/bis/impl/HmSpider.java (+3 -3)
...java/com/diaoyun/zion/chinafrica/bis/impl/MajeSpider.java (+14 -9)
...ava/com/diaoyun/zion/chinafrica/bis/impl/OyshoSpider.java (+1 -1)
...ava/com/diaoyun/zion/chinafrica/bis/impl/PradaSpider.java (+3 -0)
.../diaoyun/zion/chinafrica/bis/impl/StradivariusSpider.java (+181 -0)
...va/com/diaoyun/zion/chinafrica/bis/impl/UniqloSpider.java (+0 -1)
.../com/diaoyun/zion/chinafrica/bis/impl/ZaraHomeSpider.java (+187 -0)
...java/com/diaoyun/zion/chinafrica/bis/impl/ZaraSpider.java (+34 -33)
.../java/com/diaoyun/zion/chinafrica/enums/PlatformEnum.java (+1 -0)
...om/diaoyun/zion/chinafrica/factory/ItemSpiderFactory.java (+4 -0)
...aoyun/zion/chinafrica/service/impl/SpiderServiceImpl.java (+4 -0)
doc/SpiderSpecification.md
@@ -26,31 +26,23 @@
 - Home page: https://www.pullandbear.cn/cn/%E5%A5%B3%E5%A3%AB-c1030204574.html
 - Name: pullandbear
 - Spider status: **Done**
-- Has an anti-scraping mechanism; sometimes fails outright, unstable
-- Defects:
-- Color and style data is wrong
-- Sizes are not matched to styles
+- Defect: sizes overlap
 ### 2.[Gap](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/GapItemSpider.java)
 - Home page: https://www.gap.cn/
 - Name: gap
 - Spider status: **Done**
-- Broken, cannot scrape data
 ### 3.[Zara](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/ZaraSpider.java)
 - Home page: https://www.zara.cn/cn
 - Name: zara
 - Spider status: **Done**
-- Possible defects:
 ### 4.[Uniqlo(优衣库)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/UniqloSpider.java)
 - Home page: https://www.uniqlo.cn/UNIQLO_U19FW_MEN.html
 - Name: uniqlo
 - Spider status: **Done**
-- Broken
-- Links are protected against scraping
-- Possible defects:
-- The image path downloads the image directly
+- Broken: the app cannot capture the product detail page link
 ### 5.[Nike(耐克)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/NikeItemSpider.java)
 - Home page: https://www.nike.com/cn
@@ -61,13 +53,12 @@
 - Home page: https://www.adidas.com.cn/
 - Name: adidas
 - Spider status: **Done**
-- Usable, but with defects:
-- Product sizes do not match
 ### 7.[H&M](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/HmSpider.java)
 - Home page: https://www2.hm.com/zh_cn/
 - Name: hm
 - Spider status: **Done**
+- Defect: sizes overlap
 ### 8.LiLy
 - Home page: http://www.lily.sh.cn/webapp/wcs/stores/servlet/lilystore
@@ -85,25 +76,18 @@
 - Home page: http://www.ur.cn/index.html
 - Name: urbanrevivo
 - Spider status: **Done**
-- Data sources:
-- Sample product: http://wap.ur.com.cn/product/detail?productColorId=ff8080816dbb693e016dfd58f27c45d9
-- Data endpoint: http://wap.ur.com.cn/product/product/detail?id=ff8080816dbb693e016dfd58f27c45d9
-- Usable, but with defects:
 ### 11.[Aber Crombie & Fitch](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/AberCrombieFitchSpider.java)
 - Home page: https://www.abercrombie.cn/zh_CN/home
 - Name: abercrombie
 - Spider status: **Done**
-- Has a reverse-proxy anti-scraping mechanism; workaround pending
+- Broken: a reverse-proxy anti-scraping mechanism prevents fetching data
 ### 12.[Under Armour(安德玛)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/UnderArmourSpider.java)
 - Home page: https://www.underarmour.cn/
 - Name: ur
 - Spider status: **Done**
-- Usable, but with defects:
-- Too slow
-- Main image is broken
-- Sizes do not match stock
+- Defect: too slow
 ### 13.Converse(匡威)
 - Home page: https://www.converse.com.cn/
@@ -115,18 +99,18 @@
 - Home page: http://www.ochirly.com.cn/SALE/list.shtml
 - Name: ochirly
 - Spider status: **Done**
-- Usable, but with defects:
 ### 15.[Esprit(埃斯普利特)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/EspritSpider.java)
 - Home page: https://www.esprit.cn/
 - Name: esprit
 - Spider status: **Done**
-- Scraping from the app is broken
+- Broken: the app cannot capture the product detail page link
 ### 16.[Levi(李维斯)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/LeviSpider.java)
 - Home page: https://www.levi.com.cn/
 - Name: levi
 - Spider status: **Done**
-- Scraping from the app is broken
+- Defect: sizes overlap
 ### 17.[MO&Co.(摩安珂)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/MocoSpider.java)
 - Home page: https://www.moco.com/moco/zh/c/BS_DISCOUNT
@@ -139,41 +123,35 @@
 - Home page: https://www.massimodutti.cn/cn/男装/季末折扣/休闲西装-c1745921.html
 - Name: massimodutti
 - Spider status: **Done**
-- Broken
-- Links are protected against scraping
-- Data sources
-- Product detail: https://www.massimodutti.cn/cn/%E5%A5%B3%E8%A3%85/%E7%B3%BB%E5%88%97/%E8%A1%AC%E8%A1%AB%E5%92%8C%E7%BD%A9%E8%A1%AB/%E8%A1%AC%E8%A1%AB/%E6%BB%91%E9%9B%AA%E9%A3%8E%E7%B3%BB%E5%88%97%E9%A5%B0%E5%8F%A3%E8%A2%8B%E8%A1%AC%E8%A1%AB-c1718602p8730105.html?colorId=420&categoryId=1718602
-- Data endpoint: https://www.massimodutti.cn/itxrest/2/catalog/store/35009478/30359500/category/0/product/8730105/detail?languageId=-7&appId=1
+- Broken: the app cannot capture the product detail page link
 ### 19.[COACH(蔻驰)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/CoachSpider.java)
 - Home page: https://china.coach.com/women.html
 - Name: coach
 - Spider status: **Done**
-- Data sources
-- Product detail: https://china.coach.com/coach-essentials-oversize-cardigan/69007_LPK.html?c=8664
-- Data endpoint: https://china.coach.com/rest/default/V1/applet/product/CONF69007_LPK
-- Known defect: still need to check whether color or size data exists
+- Broken: has a bug
 ### 20.[Revolve](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/RevolveSpider.java)
 - Home page: https://www.revolve.com/wrangler/br/57f1a1/?utm_source=baidu&utm_medium=cpc&utm_campaign=intl_P_cn-d-Wrangler
 - Name: reolve
 - Spider status: **Done**
+- Broken: the app cannot capture the product detail page link
 ### 21.Vans(范斯)
 - Home page: https://vans.com.cn/gallery-index---0---36.html
 - Name: Vans
 - Spider status:
-### 22.ZaraHome
+### 22.[ZaraHome](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/ZaraHomeSpider.java)
 - Home page: https://zarahome.tmall.com/?spm=a1z10.3-b-s.1997427721.d4918089.7b872e00zWrHhi
-- Name:
-- Spider status: Tmall-hosted storefront
+- Name: zaraHome
+- Spider status: **Done**
 ### 23.[Oysho](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/OyshoSpider.java)
 - Home page: https://oysho.tmall.com/ (SPORT WEAR)
 - Name:
 - Spider status: **Done**
-- Improve handling of the product id in the link
 ### 24.[Stradivarius(斯特拉迪瓦里斯)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/StradivariusSpider.java)
 - Home page: https://www.stradivarius.cn/cn/
@@ -203,7 +181,8 @@
 ### 29.[Fendi(芬迪)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/FendiSpider.java)
 - Home page: https://www.fendi.cn/?utm_source=Baidu&utm_medium=PC&utm_campaign=NewBrand%20Pure&utm_content=B_Site
 - Name: fendi
-- Spider status:
+- Spider status: **Done**
+- Defect: cannot scrape data when the product has multiple colors
 ### 30.HuaWei(华为)
 - Home page: https://www.vmall.com/huawei?cid=78140
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/ConverseSpider.java
0 → 100644
package com.diaoyun.zion.chinafrica.bis.impl;

/**
 * Converse data spider
 *
 * @author 爱酱油不爱醋
 */
public class ConverseSpider {

//    public static void main(String[] args) throws Exception {
//        String targetUrl = "https://m.converse.com.cn/inventory/168131C110";
//        String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.ZARAHOME.getValue());
//        System.err.println(content);
//    }
}
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/FendiSpider.java
@@ -47,6 +47,9 @@ public class FendiSpider implements IItemSpider {
     /**
      * Format the returned data
+     *
+     * TODO need to check whether the node is a JSONObject or a JSONArray
+     *
      * @param content main page content
      * @return formatted data
      */
@@ -86,6 +89,8 @@ public class FendiSpider implements IItemSpider {
         itemInfo.setTitle(document.select("div[class=info__summary]").text().trim());
         //////////////////////////////////// Get basic product info END ///////////////////////////////////////////////
+        System.err.println(pUrlObj);
         JSONArray dataArr = pUrlObj.getJSONObject("magento_api").getJSONObject("data").getJSONArray("data_item");
         for (int i = 0; i < dataArr.size(); i++) {
             JSONObject dataObj = dataArr.getJSONObject(i);
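The TODO added to FendiSpider says the parsed node may turn out to be either a JSONObject or a JSONArray. One way to handle that is to normalise both shapes into a list before iterating. The following is only a sketch under stated assumptions: `JsonNodeSketch` and `asObjectList` are hypothetical names, and plain `java.util.Map` and `java.util.List` stand in for the `net.sf.json` types so the example runs without the project's dependencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: turns "object or array of objects" into a list of objects. */
final class JsonNodeSketch {
    private JsonNodeSketch() {
    }

    // Assumption: Map stands in for a JSON object and List for a JSON array.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> asObjectList(Object node) {
        List<Map<String, Object>> result = new ArrayList<>();
        if (node instanceof List) {
            // Array case: keep every element that is itself an object.
            for (Object element : (List<Object>) node) {
                if (element instanceof Map) {
                    result.add((Map<String, Object>) element);
                }
            }
        } else if (node instanceof Map) {
            // Single-object case: wrap it so callers can always loop.
            result.add((Map<String, Object>) node);
        }
        // Anything else (null, string, number) yields an empty list.
        return result;
    }
}
```

With a helper of this shape the calling loop stays the same whether the site returns one item or several, and an unexpected node type results in an empty list rather than an exception.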
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/GucciSpider.java
@@ -19,12 +19,13 @@ import java.net.URISyntaxException;
 import java.util.*;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.TimeoutException;
-import java.util.regex.Pattern;
 import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
 /**
  * Gucci(古驰) data spider
+ * TODO the displayed data has a bug
+ * @author 爱酱油不爱醋
  */
 @Component("gucciSpider")
 public class GucciSpider implements IItemSpider {
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/HmSpider.java
@@ -53,7 +53,7 @@ public class HmSpider implements IItemSpider {
     /**
      * Format the returned data
-     *
+     * TODO sizes of colors that are not shown on the page are also counted
      * @param content page data
      * @return formatted data
      */
@@ -68,7 +68,6 @@ public class HmSpider implements IItemSpider {
                 .replaceAll("\"fullscreen\": isDesktop [?] ", "")
                 .replaceAll("\"zoom\": isDesktop [?] ", "")
                 .replaceAll("isDesktop [?] \"//www2.hm.com/\" : ", "");
-        System.err.println(resultStr);
         JSONObject dataMap = JSONObject.fromObject(resultStr);
         Document document = Jsoup.parse(content);
@@ -133,12 +132,13 @@ public class HmSpider implements IItemSpider {
         //////////////////////////////////// Get product color properties END ////////////////////////////////////////////
         ///////////////////////// Get product size properties ///////////////////////////////////////////////////////////
+        // TODO something seems to be wrong here...
         JSONArray sizeArr = dataMap.getJSONObject(colorNo).getJSONArray("sizes");
         for (int i = 0; i < sizeArr.size(); i++) {
             JSONObject sizeObj = sizeArr.getJSONObject(i);
-            String sizeNo = sizeObj.getString("sizeCode");
             String size = sizeObj.getString("name");
+            String sizeNo = sizeObj.getString("sizeCode");
             ProductProp productPropSize = new ProductProp();
             productPropSize.setPropId(sizeNo);
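The TODO in HmSpider notes that sizes belonging to colors not shown on the page are also collected, which matches the "sizes overlap" defect recorded in the specification. A possible remedy is to keep sizes grouped by color and merge only the colors that are actually displayed. The sketch below assumes the page data has already been parsed into nested maps; `SizeFilterSketch`, `visibleSizes`, and both parameters are hypothetical and not part of the project.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: collects sizes only for colors that are displayed. */
final class SizeFilterSketch {
    private SizeFilterSketch() {
    }

    /**
     * @param sizesByColor  color code -> (size code -> size name), assumed to be parsed from the page data
     * @param visibleColors color codes that are actually rendered on the page
     * @return size code -> size name for visible colors only, de-duplicated by size code
     */
    static Map<String, String> visibleSizes(Map<String, Map<String, String>> sizesByColor,
                                            Set<String> visibleColors) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> entry : sizesByColor.entrySet()) {
            if (!visibleColors.contains(entry.getKey())) {
                continue; // skip colors that are not shown on the page
            }
            for (Map.Entry<String, String> size : entry.getValue().entrySet()) {
                // putIfAbsent keeps the first name seen for a size code, so repeats do not pile up.
                result.putIfAbsent(size.getKey(), size.getValue());
            }
        }
        return result;
    }
}
```

Keying the result by size code is a design choice for this sketch: it assumes one size code always maps to the same display name across colors.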
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/MajeSpider.java
@@ -4,8 +4,8 @@ import com.diaoyun.zion.chinafrica.bis.IItemSpider;
 import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
 import com.diaoyun.zion.chinafrica.vo.*;
 import com.diaoyun.zion.master.util.HttpClientUtil;
-import com.diaoyun.zion.master.util.TranslateHelper;
 import com.diaoyun.zion.master.util.SpiderUtil;
+import com.diaoyun.zion.master.util.TranslateHelper;
 import net.sf.json.JSONObject;
 import org.jsoup.Jsoup;
 import org.jsoup.nodes.Document;
@@ -20,7 +20,6 @@ import java.net.URISyntaxException;
 import java.util.*;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.TimeoutException;
-import java.util.regex.Pattern;
 /**
  * MajeSpider data spider
@@ -73,7 +72,7 @@ public class MajeSpider implements IItemSpider {
         Document document = Jsoup.parse(content);
         //////////////////////////////////// Get basic product info ////////////////////////////
-//        itemInfo.setItemId(document.select(""));
+        itemInfo.setItemId(document.select("span[class=breadcrumb-last]").text());
         itemInfo.setShopName("Maje");
         itemInfo.setShopUrl("https://www.maje.cn/");
         itemInfo.setTitle(document.select("meta[property=og:title]").attr("content"));
@@ -85,17 +84,23 @@ public class MajeSpider implements IItemSpider {
         Elements pContentEle = document.select("div[id=product-content]").select("ul[class=dropdown-content]");
         Elements colorsEle = pContentEle.select("ul[class=swatches Color]").select("a");
         //////////////////////////////////// Get product color properties ////////////////////////////
-        for (Element colorEle : colorsEle) {
+        for (int i = 0; i < colorsEle.size(); i++) {
+            String dataIgimg = colorsEle.get(i).attr("data-lgimg");
+            JSONObject dataIgimgObj = JSONObject.fromObject(dataIgimg);
-            String colorNo = colorEle.attr("data-variationparameter");
-            String color = colorEle.attr("title");
+            String colorNo = colorsEle.get(i).attr("data-variationparameter");
+            String color = colorsEle.get(i).attr("title");
+            String imgUrl = dataIgimgObj.getString("url"); // TODO image path is not processed yet
+            if (i == 0) {
+                itemInfo.setPic(imgUrl);
+            }
             ProductProp productPropColor = new ProductProp();
             productPropColor.setPropId(colorNo);
             productPropColor.setPropName(color);
-//            productPropColor.setImage(imgUrl);
+            productPropColor.setImage(imgUrl);
             propSet.add(productPropColor);
             if (productPropSet.get("颜色") == null) {
                 productPropSet.put("颜色", propSet);
@@ -108,7 +113,6 @@ public class MajeSpider implements IItemSpider {
             ///////////////////////// Get product size properties ///////////////////////////////////////////////////////
             Elements sizesEle = pContentEle.select("ul[class=swatches size]").select("a");
             for (Element sizeEle : sizesEle) {
                 String sizeNo = sizeEle.attr("data-variationparameter");
@@ -156,6 +160,7 @@ public class MajeSpider implements IItemSpider {
                 //////////////////////////////////// Get stock and original price END ///////////////////////////////
             }
         }
         productResponse.setProductPropSet(productPropSet);
         productResponse.setPlatform("Maje");
         productResponse.setPromotionList(promotionList);
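The MajeSpider change leaves a TODO saying the image path taken from `data-lgimg` is not processed yet. If the value can be protocol-relative (`//host/path`) or site-relative (`/path`), resolving it against the shop URL gives an absolute link. This is a minimal sketch under that assumption; `ImageUrlSketch` and `toAbsolute` are hypothetical names, and the real attribute values were not checked.

```java
import java.net.URI;

/** Hypothetical helper: resolves a possibly relative image path against a base URL. */
final class ImageUrlSketch {
    private ImageUrlSketch() {
    }

    // URI.resolve keeps absolute URLs as they are, borrows the scheme for "//host/path",
    // and borrows scheme and host for "/path".
    static String toAbsolute(String baseUrl, String rawUrl) {
        return URI.create(baseUrl).resolve(rawUrl.trim()).toString();
    }
}
```

For example, `ImageUrlSketch.toAbsolute("https://www.maje.cn/", imgUrl)` could be applied before `setPic` and `setImage`. Note that `URI.resolve(String)` throws `IllegalArgumentException` for strings that are not valid URIs, such as ones containing unencoded spaces, so real input may need encoding first.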
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/OyshoSpider.java
@@ -21,7 +21,7 @@ import java.util.concurrent.TimeoutException;
 /**
  * Oysho data spider
- *
+ * TODO cannot capture the product detail page link
  * @author 爱酱油不爱醋
  */
 @Component("oyshoSpider")
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/PradaSpider.java
@@ -46,6 +46,9 @@ public class PradaSpider implements IItemSpider {
     /**
      * Format the returned data
+     *
+     * TODO there is a problem where data cannot be scraped
+     *
      * @param content main page data
      * @return formatted data
      */
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/StradivariusSpider.java
0 → 100644
package com.diaoyun.zion.chinafrica.bis.impl;

import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.math.BigDecimal;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;

import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;

/**
 * Stradivarius data spider
 *
 * @author 爱酱油不爱醋
 */
@Component("stradivariusSpider")
public class StradivariusSpider implements IItemSpider {
    /**
     * Stradivarius data spider
     * @param targetUrl URL of the product detail page
     * @return formatted and translated JSON data
     */
    @Override
    public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
        String[] spilt = targetUrl.split("p");
        spilt = spilt[2].split(".html");
        String pId = spilt[0];
        targetUrl = "https://www.stradivarius.cn/itxrest/2/catalog/store/55009578/50331061/category/0/product/" + pId + "/detail";
        String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.STRADIVARIUS.getValue());
        JSONObject resultObj = JSONObject.fromObject(content);
        ProductResponse productResponse = formatProductResponse(resultObj, pId);
        resultObj = JSONObject.fromObject(productResponse);
        TranslateHelper.translateProductResponse(resultObj);
        return resultObj;
    }
    /**
     * Format the returned data
     * @param dataMap main JSON data
     * @param pId product id
     * @return formatted data
     */
    private ProductResponse formatProductResponse(JSONObject dataMap, String pId) {
        // Create the response wrapper
        ProductResponse productResponse = new ProductResponse();
        // The product has properties, so set the flag to true
        productResponse.setPropFlag(true);
        // Stock info; defaults to 999 when no usable stock info is available
        DynStock dynStock = new DynStock();
        dynStock.setSellableQuantity(9999);
        List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
        // Original and promotional prices of the product
        List<OriginalPrice> originalPriceList = new ArrayList<>();
        List<ProductPromotion> promotionList = new ArrayList<>();
        // Product properties; the common ones are color and size
        Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
        Set<ProductProp> propSet = new HashSet<>(16);
        Set<ProductProp> sizePropSet = new HashSet<>(16);
        productResponse.setStockFlag(true);
        // Basic product info
        ItemInfo itemInfo = new ItemInfo();
        // Get the detail node object
        JSONObject detailObj = dataMap.getJSONObject("detail");

        //////////////////////////////////// Get basic product info ////////////////////////////
        itemInfo.setItemId(pId);
        itemInfo.setShopName("Stradivarius");
        itemInfo.setShopUrl("https://www.stradivarius.cn/");
        itemInfo.setTitle(detailObj.getString("description"));
        //////////////////////////////////// Get basic product info END /////////////////////////
        //////////////////////////////////// Get product color properties ////////////////////////////
        // Get the colors node array
        JSONArray colorsArr = detailObj.getJSONArray("colors");
        for (int i = 0; i < colorsArr.size(); i++) {
            JSONObject colorsObj = colorsArr.getJSONObject(i);
            String colorNo = colorsObj.getString("id");
            String color = colorsObj.getString("name");
            String imgUrl = "https://static.stradivarius.cn/5/photos3" + colorsObj.getJSONObject("image").getString("url") + "_6_1_4.jpg?t=" + colorsObj.getJSONObject("image").getString("timestamp");
            if (i == 0) {
                itemInfo.setPic(imgUrl);
            }
            ProductProp productPropColor = new ProductProp();
            productPropColor.setPropId(colorNo);
            productPropColor.setPropName(color);
            productPropColor.setImage(imgUrl);
            propSet.add(productPropColor);
            // The map key "颜色" means "color"
            if (productPropSet.get("颜色") == null) {
                productPropSet.put("颜色", propSet);
            } else {
                Set<ProductProp> oldPropSet = productPropSet.get("颜色");
                propSet.addAll(oldPropSet);
                productPropSet.put("颜色", propSet);
            }
            //////////////////////////////////// Get product color properties END ////////////////////////////////////////

            ///////////////////////// Get product size properties ///////////////////////////////////////////////////////
            // Get the sizes array of each color
            JSONArray sizesArr = colorsObj.getJSONArray("sizes");
            for (int j = 0; j < sizesArr.size(); j++) {
                JSONObject sizesObj = sizesArr.getJSONObject(j);
                String sizeNo = sizesObj.getString("sku");
                String size = sizesObj.getString("name");
                ProductProp productPropSize = new ProductProp();
                productPropSize.setPropId(sizeNo);
                productPropSize.setPropName(size);
                sizePropSet.add(productPropSize);
                // The map key "尺码" means "size"
                if (productPropSet.get("尺码") == null) {
                    productPropSet.put("尺码", sizePropSet);
                } else {
                    Set<ProductProp> oldPropSet = productPropSet.get("尺码");
                    sizePropSet.addAll(oldPropSet);
                    productPropSet.put("尺码", sizePropSet);
                }
                ///////////////////////// Get product size properties END ///////////////////////////////////////////////////

                //////////////////////////////////// Get stock and original price ////////////////////////////////////////////
                // Build the stock id
                String skuStr = ";" + colorNo + ";" + sizeNo + ";";
                if (productSkuStockList == null) {
                    productSkuStockList = new ArrayList<>();
                }
                ProductSkuStock productSkuStock = new ProductSkuStock();
                productSkuStock.setSkuStr(skuStr);
                productSkuStock.setSellableQuantity(999);
                productSkuStockList.add(productSkuStock);
                dynStock.setProductSkuStockList(productSkuStockList);
                // Get the product's original price
                String fullPrice = sizesObj.getString("price");
                BigDecimal priceOld = new BigDecimal(fullPrice);
                BigDecimal div = new BigDecimal("100");
                fullPrice = priceOld.divide(div, 2, BigDecimal.ROUND_DOWN).toString();
                // TODO convert the exchange rate; the price is currently in RMB
                String originalFullPrice = exchangeRate(fullPrice);
                OriginalPrice originalPrice = new OriginalPrice();
                originalPrice.setPrice(originalFullPrice);
                originalPrice.setSkuStr(skuStr);
                originalPriceList.add(originalPrice);
                productResponse.setPrice(originalFullPrice);
                productResponse.setSalePrice(originalFullPrice + "-" + originalFullPrice);
                //////////////////////////////////// Get stock and original price END ///////////////////////////////
            }
        }
        productResponse.setProductPropSet(productPropSet);
        productResponse.setPlatform("Stradivarius");
        productResponse.setPromotionList(promotionList);
        productResponse.setOriginalPriceList(originalPriceList);
        productResponse.setItemInfo(itemInfo);
        productResponse.setDynStock(dynStock);
        return productResponse;
    }
}
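Two steps in StradivariusSpider look fragile. `targetUrl.split("p")` also splits on the "p" in "https" and on any other "p" in the path, so index 2 holds the product id only for some URLs. `String.split` also takes a regular expression, so the "." in `split(".html")` matches any character. In addition, `BigDecimal.ROUND_DOWN` is deprecated in newer JDKs in favour of `RoundingMode.DOWN`. The sketch below is one possible alternative, not the project's code: `InditexUrlSketch`, `extractProductId`, and `centsToYuan` are hypothetical names. It assumes product URLs contain `p<digits>.html` (as in the massimodutti example in the specification) and that the `price` field is an integer number of cents (as the division by 100 suggests).

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helpers for product-id extraction and price conversion. */
final class InditexUrlSketch {
    // Assumption: the product id is the digit run between "p" and ".html".
    private static final Pattern PRODUCT_ID = Pattern.compile("p(\\d+)\\.html");

    private InditexUrlSketch() {
    }

    /** Returns the product id, or an empty Optional when the URL has no such segment. */
    static Optional<String> extractProductId(String productUrl) {
        Matcher matcher = PRODUCT_ID.matcher(productUrl);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    /** Converts an integer amount in cents to a two-decimal string, truncating extra digits. */
    static String centsToYuan(String cents) {
        return new BigDecimal(cents)
                .movePointLeft(2)
                .setScale(2, RoundingMode.DOWN)
                .toPlainString();
    }
}
```

Returning an `Optional` makes the "no id in this URL" case explicit, so the caller can decide whether to fail or skip the item instead of hitting an `ArrayIndexOutOfBoundsException`.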
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/UniqloSpider.java
@@ -17,7 +17,6 @@ import java.net.URISyntaxException;
 import java.util.*;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.TimeoutException;
-import java.util.regex.Pattern;
 /**
  * Uniqlo data spider
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/ZaraHomeSpider.java
0 → 100644
package com.diaoyun.zion.chinafrica.bis.impl;

import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.math.BigDecimal;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;

/**
 * ZaraHome data spider
 *
 * @author 爱酱油不爱醋
 */
@Component("zaraHomeSpider")
public class ZaraHomeSpider implements IItemSpider {
    private static Logger logger = LoggerFactory.getLogger(ZaraHomeSpider.class);

    /**
     * ZaraHome data spider
     * @param targetUrl URL of the product detail page
     * @return formatted and translated JSON data
     */
    @Override
    public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
        Matcher matcher = Pattern.compile("p\\d+").matcher(targetUrl);
        matcher.find();
        String pId = matcher.group().substring(1);
        targetUrl = "https://www.zarahome.cn/itxrest/2/catalog/store/85009928/80290014/category/0/product/" + pId + "/detail";
        String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.ZARAHOME.getValue());
        JSONObject resultObj = JSONObject.fromObject(content);
        ProductResponse productResponse = formatProductResponse(resultObj, pId);
        resultObj = JSONObject.fromObject(productResponse);
        TranslateHelper.translateProductResponse(resultObj);
        return resultObj;
    }
    /**
     * Format the data returned by Zara
     * @param dataMap main JSON data
     * @return formatted data
     */
    private ProductResponse formatProductResponse(JSONObject dataMap, String pId) {
        // Create the response wrapper
        ProductResponse productResponse = new ProductResponse();
        // The product has properties, so set the flag to true
        productResponse.setPropFlag(true);
        // Stock info; defaults to 999 when no usable stock info is available
        DynStock dynStock = new DynStock();
        dynStock.setSellableQuantity(9999);
        List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
        // Original and promotional prices of the product
        List<OriginalPrice> originalPriceList = new ArrayList<>();
        List<ProductPromotion> promotionList = new ArrayList<>();
        // Product properties; the common ones are color and size
        Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
        Set<ProductProp> propSet = new HashSet<>(16);
        Set<ProductProp> sizePropSet = new HashSet<>(16);
        productResponse.setStockFlag(false);
        // Basic product info
        ItemInfo itemInfo = new ItemInfo();

        //////////////////////////////////// Get basic product info ////////////////////////////////////////////
        itemInfo.setShopName("ZaraHome");
        itemInfo.setItemId(pId);
        itemInfo.setShopUrl("https://www.zarahome.cn/");
        itemInfo.setTitle(dataMap.getString("name"));
        //////////////////////////////////// Get basic product info END (the image is set below) ////////////////////////////////////////////
List
<
Double
>
priceList
=
new
ArrayList
<>();
// 取 colors 节点数组
JSONArray
colorsArr
=
dataMap
.
getJSONObject
(
"detail"
).
getJSONArray
(
"colors"
);
        for (int i = 0; i < colorsArr.size(); i++) {
            JSONObject colorsObj = colorsArr.getJSONObject(i);
            // Take the first object of the detailImagesArr node array
            JSONObject imageObj = colorsObj.getJSONObject("image");
            String colorNo = colorsObj.getString("id");
            String color = colorsObj.getString("name");
            String imageUrl = "https://static.zarahome.cn/8/photos4" + imageObj.getString("url") + "_1_1_2.jpg?t=" + imageObj.getString("timestamp");
            if (i == 0) {
                itemInfo.setPic(imageUrl);
            }
            //////////////////////////////////// Get the product color property ////////////////////////////////////////////
            ProductProp productPropColor = new ProductProp();
            productPropColor.setPropId(colorNo);
            productPropColor.setPropName(color);
            productPropColor.setImage(imageUrl);
            propSet.add(productPropColor);
            if (productPropSet.get("颜色") == null) {
                productPropSet.put("颜色", propSet);
            } else {
                Set<ProductProp> oldPropSet = productPropSet.get("颜色");
                propSet.addAll(oldPropSet);
                productPropSet.put("颜色", propSet);
            }
            //////////////////////////////////// Get the product color property END ////////////////////////////////////////////
            // Get the sizes node array
            JSONArray sizesArr = colorsArr.getJSONObject(i).getJSONArray("sizes");
            for (int j = 0; j < sizesArr.size(); j++) {
                JSONObject sizesObj = sizesArr.getJSONObject(j);
                ///////////////////////// Get the product size property ////////////////////
                String sizeNo = sizesObj.getString("sku");
                String size = sizesObj.getString("name");
                ProductProp productPropSize = new ProductProp();
                productPropSize.setPropId(sizeNo);
                productPropSize.setPropName(size);
                sizePropSet.add(productPropSize);
                if (productPropSet.get("尺码") == null) {
                    productPropSet.put("尺码", sizePropSet);
                } else {
                    Set<ProductProp> oldPropSet = productPropSet.get("尺码");
                    sizePropSet.addAll(oldPropSet);
                    productPropSet.put("尺码", sizePropSet);
                }
                ///////////////////////// Get the product size property END ////////////////////
                // ID that the stock entry is keyed by (in Zara: color id + size id)
                String skuStr = ";" + colorNo + ";" + sizeNo + ";";
                //////////////////////////////////// Get stock ////////////////////////////////////////////
                if (productSkuStockList == null) {
                    productSkuStockList = new ArrayList<>();
                }
                ProductSkuStock productSkuStock = new ProductSkuStock();
                productSkuStock.setSkuStr(skuStr);
                productSkuStock.setSellableQuantity(999);
                productSkuStockList.add(productSkuStock);
                dynStock.setProductSkuStockList(productSkuStockList);
                //////////////////////////////////// Get stock END /////////////////////////////////////////
                //////////////////////////////////// Get original price //////////////////////////////////
                String fullPrice = sizesObj.getString("price");
                BigDecimal priceOld = new BigDecimal(fullPrice);
                BigDecimal div = new BigDecimal("100");
                BigDecimal priceNew = priceOld.divide(div, 2, BigDecimal.ROUND_DOWN);
                fullPrice = exchangeRate(priceNew.toString());
                priceList.add(Double.valueOf(fullPrice));
                OriginalPrice originalPrice = new OriginalPrice();
                originalPrice.setPrice(fullPrice);
                originalPrice.setSkuStr(skuStr);
                originalPriceList.add(originalPrice);
                productResponse.setPrice(fullPrice);
                //////////////////////////////////// Get original price END //////////////////////////////////
            }
        }
        // Take the minimum and maximum of the collected prices
        Double minPrice = Collections.min(priceList);
        Double maxPrice = Collections.max(priceList);
        productResponse.setSalePrice(minPrice + "-" + maxPrice);
        productResponse.setProductPropSet(productPropSet);
        productResponse.setPlatform("ZaraHome");
        productResponse.setPromotionList(promotionList);
        productResponse.setOriginalPriceList(originalPriceList);
        productResponse.setItemInfo(itemInfo);
        productResponse.setDynStock(dynStock);
        return productResponse;
    }
}
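The price handling in `formatProductResponse` (divide the raw price by 100, truncate to two decimals, then report a "min-max" range) can be looked at in isolation. The following is only a minimal sketch under stated assumptions: the class `PriceRangeSketch` and its method names are hypothetical and not part of this commit, the `exchangeRate` step is left out, and `RoundingMode.DOWN` is used as the non-deprecated equivalent of `BigDecimal.ROUND_DOWN`. It also guards against an empty price list, since `Collections.min` throws `NoSuchElementException` on an empty collection.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of the commit.
final class PriceRangeSketch {

    private static final BigDecimal HUNDRED = new BigDecimal("100");

    // Converts a price given in minor units (e.g. "19900") to major units,
    // truncated (not rounded) to two decimal places.
    static BigDecimal toMajorUnits(String minorUnits) {
        return new BigDecimal(minorUnits).divide(HUNDRED, 2, RoundingMode.DOWN);
    }

    // Builds the "min-max" range string; returns "" for an empty list
    // instead of letting Collections.min throw.
    static String saleRange(List<BigDecimal> prices) {
        if (prices.isEmpty()) {
            return "";
        }
        return Collections.min(prices).toPlainString()
                + "-" + Collections.max(prices).toPlainString();
    }

    public static void main(String[] args) {
        List<BigDecimal> prices = Arrays.asList(
                toMajorUnits("19900"), toMajorUnits("9950"), toMajorUnits("29900"));
        System.out.println(saleRange(prices)); // prints 99.50-299.00
    }
}
```

Keeping the values as `BigDecimal` until the string is built avoids the `Double` round trip, which can print a price such as 99.50 as `99.5`.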
src/main/java/com/diaoyun/zion/chinafrica/bis/impl/ZaraSpider.java
View file @ da07cbe8
@@ -55,19 +55,24 @@ public class ZaraSpider implements IItemSpider {
     private ProductResponse formatZaraProductResponse(JSONObject dataMap) {
         // Declare the wrapper object
         ProductResponse productResponse = new ProductResponse();
-        // Properties: Zara products have color and size properties
-        Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
-        // Original price
-        List<OriginalPrice> originalPriceList = new ArrayList<>();
-        // Promotional price
-        List<ProductPromotion> promotionList = new ArrayList<>();
-        // Stock
+        // The product has properties, so set this flag to true
+        productResponse.setPropFlag(true);
+        // Stock info; defaults to 999 when no usable stock data is available
         DynStock dynStock = new DynStock();
-        // The data does not actually contain exact stock counts, so default to ample stock here
         dynStock.setSellableQuantity(9999);
+        List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
+        // Original and promotional prices of the product
+        List<OriginalPrice> originalPriceList = new ArrayList<>();
+        List<ProductPromotion> promotionList = new ArrayList<>();
+        // Product properties; the common ones are color and size
+        Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
+        Set<ProductProp> propSet = new HashSet<>(16);
+        Set<ProductProp> sizePropSet = new HashSet<>(16);
+        productResponse.setStockFlag(true);
+        // Basic product info
+        ItemInfo itemInfo = new ItemInfo();
         //////////////////////////////////// Get basic product info ////////////////////////////////////////////
-        ItemInfo itemInfo = new ItemInfo();
         itemInfo.setShopName("Zara");
         itemInfo.setShopUrl(dataMap.getString("backUrl"));
         JSONObject productObj = dataMap.getJSONObject("product");
@@ -77,16 +82,14 @@ public class ZaraSpider implements IItemSpider {
         // Get the colors node array
         JSONArray colorsArr = productObj.getJSONObject("detail").getJSONArray("colors");
-        Set<ProductProp> sizePropSet = new HashSet<>(16);
-        Set<ProductProp> propSet = new HashSet<>(16);
-        productResponse.setStockFlag(true);
-        List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
         for (int i = 0; i < colorsArr.size(); i++) {
             JSONObject colorsObj = colorsArr.getJSONObject(i);
             // Take the first object of the detailImagesArr node array
             JSONObject detailImagesObj_0 = colorsObj.getJSONArray("detailImages").getJSONObject(0);
-            // Process the image. Reference path: http://static.zara.cn/photos///2019/I/0/1/p/0858/457/800/17/w/1920/0858457800_1_1_1.jpg?ts=1570720340221
+            String colorNo = colorsObj.getString("productId");
+            String color = colorsObj.getString("name");
             String imageUrl = "http://static.zara.cn/photos//"
                     + detailImagesObj_0.getString("path")
                     + "w/1920/"
@@ -95,14 +98,13 @@ public class ZaraSpider implements IItemSpider {
...
@@ -95,14 +98,13 @@ public class ZaraSpider implements IItemSpider {
+
detailImagesObj_0
.
getString
(
"timestamp"
);
+
detailImagesObj_0
.
getString
(
"timestamp"
);
if
(
i
==
0
)
{
if
(
i
==
0
)
{
// 商品基本信息--设置:图片
itemInfo
.
setPic
(
imageUrl
);
itemInfo
.
setPic
(
imageUrl
);
}
}
//////////////////////////////////// 获取商品颜色属性 ////////////////////////////////////////////
//////////////////////////////////// 获取商品颜色属性 ////////////////////////////////////////////
ProductProp
productPropColor
=
new
ProductProp
();
ProductProp
productPropColor
=
new
ProductProp
();
productPropColor
.
setPropId
(
color
sObj
.
getString
(
"productId"
)
);
productPropColor
.
setPropId
(
color
No
);
productPropColor
.
setPropName
(
color
sObj
.
getString
(
"name"
)
);
productPropColor
.
setPropName
(
color
);
productPropColor
.
setImage
(
imageUrl
);
productPropColor
.
setImage
(
imageUrl
);
propSet
.
add
(
productPropColor
);
propSet
.
add
(
productPropColor
);
if
(
productPropSet
.
get
(
"颜色"
)
==
null
)
{
if
(
productPropSet
.
get
(
"颜色"
)
==
null
)
{
@@ -119,13 +121,14 @@ public class ZaraSpider implements IItemSpider {
             for (int j = 0; j < sizesArr.size(); j++) {
                 JSONObject sizesObj = sizesArr.getJSONObject(j);
-                // ID that the stock entry is keyed by (in Zara: color id + size id)
-                String skuStr = ";" + colorsObj.getString("productId") + ";" + sizesObj.getString("sku") + ";";
                 ///////////////////////// Get the product size property ////////////////////
-                ProductProp productPropSize = new ProductProp();
+                String sizeNo = sizesObj.getString("sku");
                 String size = sizesObj.getString("name");
-                productPropSize.setPropId(sizesObj.getString("sku"));
+                ProductProp productPropSize = new ProductProp();
+                productPropSize.setPropId(sizeNo);
                 productPropSize.setPropName(size);
                 sizePropSet.add(productPropSize);
                 if (productPropSet.get("尺码") == null) {
@@ -137,44 +140,42 @@ public class ZaraSpider implements IItemSpider {
                 }
                 ///////////////////////// Get the product size property END ////////////////////
+                // ID that the stock entry is keyed by (in Zara: color id + size id)
+                String skuStr = ";" + colorNo + ";" + sizeNo + ";";
                 //////////////////////////////////// Get stock ////////////////////////////////////////////
-                // Set: the product includes stock info
                 if (productSkuStockList == null) {
                     productSkuStockList = new ArrayList<>();
                 }
                 ProductSkuStock productSkuStock = new ProductSkuStock();
-                // Set: sellable quantity; Zara provides no usable stock data
-                productSkuStock.setSellableQuantity(999);
                 productSkuStock.setSkuStr(skuStr);
+                productSkuStock.setSellableQuantity(999);
                 productSkuStockList.add(productSkuStock);
                 dynStock.setProductSkuStockList(productSkuStockList);
                 //////////////////////////////////// Get stock END /////////////////////////////////////////
                 //////////////////////////////////// Get original price //////////////////////////////////
-                OriginalPrice originalPrice = new OriginalPrice();
-                // Get the product's original price
                 String fullPrice = sizesObj.getString("price");
                 BigDecimal priceOld = new BigDecimal(fullPrice);
                 BigDecimal div = new BigDecimal("100");
                 BigDecimal priceNew = priceOld.divide(div, 2, BigDecimal.ROUND_DOWN);
-                // TODO convert the exchange rate; the product price is currently in RMB
                 fullPrice = exchangeRate(priceNew.toString());
+                OriginalPrice originalPrice = new OriginalPrice();
                 originalPrice.setPrice(fullPrice);
+                originalPrice.setSkuStr(skuStr);
+                originalPriceList.add(originalPrice);
                 productResponse.setPrice(fullPrice);
                 productResponse.setSalePrice(fullPrice + "-" + fullPrice);
-                originalPrice.setSkuStr(skuStr);
-                originalPriceList.add(originalPrice);
                 //////////////////////////////////// Get original price END //////////////////////////////////
             }
         }
         // Populate the JSON data in the following order
-        productResponse.setPropFlag(true);
         productResponse.setProductPropSet(productPropSet);
-        productResponse.setPlatform(PlatformEnum.ZARA.getValue());
+        productResponse.setPlatform("Zara");
         productResponse.setPromotionList(promotionList);
         productResponse.setOriginalPriceList(originalPriceList);
         productResponse.setItemInfo(itemInfo);
...
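Both spiders repeat the same "get, check for null, addAll, put" pattern to collect the color and size properties, and both build the SKU key as `;colorId;sizeId;`. The sketch below shows one way to express the same grouping with `Map.computeIfAbsent`. It is an illustration only: `PropGroupingSketch` and its methods are hypothetical names, and plain strings stand in for the project's `ProductProp` objects, whose `equals`/`hashCode` behavior is not visible in this diff.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the commit.
final class PropGroupingSketch {

    // Adds a value to the set stored under the given group name,
    // creating the set on first use. Duplicates are ignored by the Set.
    static void addProp(Map<String, Set<String>> props, String group, String value) {
        props.computeIfAbsent(group, key -> new LinkedHashSet<>()).add(value);
    }

    // Builds the SKU key in the ";colorId;sizeId;" shape used by the spiders.
    static String skuKey(String colorId, String sizeId) {
        return ";" + colorId + ";" + sizeId + ";";
    }

    public static void main(String[] args) {
        Map<String, Set<String>> props = new LinkedHashMap<>();
        addProp(props, "color", "red");
        addProp(props, "color", "blue");
        addProp(props, "size", "M");
        System.out.println(props);
        System.out.println(skuKey("101", "S01"));
    }
}
```

With this shape the map entry is written once per group, so the per-iteration `put` calls in the loops become unnecessary.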
src/main/java/com/diaoyun/zion/chinafrica/enums/PlatformEnum.java
View file @ da07cbe8
@@ -35,6 +35,7 @@ public enum PlatformEnum implements EnumItemable<PlatformEnum> {
     COACH("蔻驰", "coach"),
     REVOLVE("Revolve", "revolve"),
     VANS("范斯", "vans"),
+    ZARAHOME("ZaraHome", "zaraHome"),
     OYSHO("Oysho", "oysho"),
     STRADIVARIUS("斯特拉迪瓦里斯", "stradivarius"),
     MAJE("Maje", "maje"),
...
src/main/java/com/diaoyun/zion/chinafrica/factory/ItemSpiderFactory.java
View file @ da07cbe8
@@ -107,6 +107,10 @@ public class ItemSpiderFactory {
                 iItemSpider = (IItemSpider) SpringContextUtil.getBean("vansSpider");
                 break;
             }
+            case "zaraHome": {
+                iItemSpider = (IItemSpider) SpringContextUtil.getBean("zaraHomeSpider");
+                break;
+            }
             case "oysho": {
                 iItemSpider = (IItemSpider) SpringContextUtil.getBean("oyshoSpider");
                 break;
...
src/main/java/com/diaoyun/zion/chinafrica/service/impl/SpiderServiceImpl.java
View file @ da07cbe8
@@ -86,6 +86,8 @@ public class SpiderServiceImpl implements SpiderService {
             platformEnum = PlatformEnum.URBANREVIVO;
         } else if (targetUrl.contains("abercrombie") && Pattern.matches("^.*abercrombie.*\\/anf-\\d{6,}.*$", targetUrl)) {
             platformEnum = PlatformEnum.ABERCROMBIEFITCH;
+        } else if (targetUrl.contains("underarmour") && Pattern.matches("^.*underarmour.*\\/p\\d+-\\d+.*$", targetUrl)) {
+            platformEnum = PlatformEnum.UNDERARMOUR;
         } else if (targetUrl.contains("ochirly") && Pattern.matches("^.*ochirly.*\\/p\\/.*\\w{10,}.*$", targetUrl)) {
             platformEnum = PlatformEnum.OCHIRLY;
         } else if (targetUrl.contains("esprit") && Pattern.matches("^.*esprit.*\\/product\\/\\w{24,}.html.*$", targetUrl)) {
@@ -102,6 +104,8 @@ public class SpiderServiceImpl implements SpiderService {
             platformEnum = PlatformEnum.REVOLVE;
         } else if (targetUrl.contains("vans") && Pattern.matches("^.*vans.com.*\\/product-\\d+.*$", targetUrl)) {
             platformEnum = PlatformEnum.VANS;
+        } else if (targetUrl.contains("zarahome") && Pattern.matches("^.*zarahome.*\\/.*c\\d+p\\d+.html.*$", targetUrl)) {
+            platformEnum = PlatformEnum.ZARAHOME;
         } else if (targetUrl.contains("oysho") && Pattern.matches("^.*oysho.*\\/.*-c\\d+p\\d+.html\\?origenId=\\d+$", targetUrl)) {
             platformEnum = PlatformEnum.OYSHO;
         } else if (targetUrl.contains("stradivarius") && Pattern.matches("^.*stradivarius.*\\/.*c\\d+p\\d+.html.*$", targetUrl)) {
...
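The new `zarahome` branch recognises product pages by the `c<digits>p<digits>.html` tail of the URL. How the spider actually derives `pId` from the URL is not shown in this part of the diff, so the following is only a sketch of one possible approach: the same pattern with a capture group around the digits after `p`. The class name `ZaraHomeUrlSketch`, the method `productId`, and the sample URL are all hypothetical. Unlike the pattern in the commit, the dot before `html` is escaped here so that it matches a literal dot only.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the commit.
final class ZaraHomeUrlSketch {

    // Same shape as the pattern in SpiderServiceImpl, plus a capture group
    // for the product id and an escaped dot before "html".
    private static final Pattern PRODUCT_URL =
            Pattern.compile("^.*zarahome.*/.*c\\d+p(\\d+)\\.html.*$");

    // Returns the digits after "p" when the URL looks like a product page.
    static Optional<String> productId(String url) {
        Matcher matcher = PRODUCT_URL.matcher(url);
        return matcher.matches() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up URL in the expected shape.
        System.out.println(productId("https://www.zarahome.cn/cn/item-c1020267521p300219009.html"));
        System.out.println(productId("https://www.zarahome.cn/cn/"));
    }
}
```

Compiling the pattern once as a constant also avoids recompiling it on every call, which `Pattern.matches` does.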