
【2014-11-23】Heterogeneous Parallel Programming – Section 1

Time: 2014-11-23 22:49:11
  1. Latency devices(CPU cores)
  2. Throughput devices(GPU cores)
  3. Use the best match for the job (heterogeneity in mobile SoC)
  6. CPU: Latency Oriented Design
    • Powerful ALU
      • Reduced operation latency
    • Large caches
      • Convert long-latency memory accesses to short-latency cache accesses
    • Sophisticated control
      • Branch prediction for reduced branch latency
      • Data forwarding for reduced data latency
  7. GPU: Throughput Oriented Design
    • Small caches
      • To boost memory throughput
    • Simple control
      • No branch prediction
      • No data forwarding
    • Energy efficient ALUs
      • Many ALUs, each with long latency but heavily pipelined for high throughput
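
The trade-off between the two designs can be made concrete with a back-of-the-envelope model. The sketch below is only illustrative: the function name `pipelined_cycles` and every latency and unit count in it are assumptions chosen for the example, not figures from the course or from real hardware. It also assumes all operations are independent, which is the case that favours a throughput design.

```python
def pipelined_cycles(num_ops: int, latency: int, num_units: int) -> int:
    """Cycles to finish num_ops independent operations.

    Assumes num_units identical, fully pipelined units, each able to
    start one new operation per cycle. The first result on a unit
    appears after `latency` cycles; each later result follows one
    cycle after the previous one.
    """
    if num_ops == 0:
        return 0
    ops_per_unit = -(-num_ops // num_units)  # ceiling division
    return latency + ops_per_unit - 1


# Hypothetical parameters, picked only to show the trend.
LATENCY_DESIGN = {"latency": 2, "num_units": 4}        # few fast ALUs
THROUGHPUT_DESIGN = {"latency": 20, "num_units": 128}  # many slow ALUs

if __name__ == "__main__":
    for n in (1, 100, 100_000):
        cpu_like = pipelined_cycles(n, **LATENCY_DESIGN)
        gpu_like = pipelined_cycles(n, **THROUGHPUT_DESIGN)
        print(f"{n:>7} ops: latency design {cpu_like} cycles, "
              f"throughput design {gpu_like} cycles")
```

Under these assumed numbers the latency-oriented design wins when there is only a single operation to run, while the throughput-oriented design wins by a wide margin once there are many independent operations to keep its pipelines full.
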
  8. Scalability
  9. Portability
  10. SPMD – Single Program, Multiple Data
  11. Threads within a block cooperate via shared memory, atomic operations, and barrier synchronization
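
The SPMD model and the three cooperation mechanisms listed above can be emulated on the CPU to see how they fit together. The following is a minimal sketch, not GPU code: `run_block`, `kernel`, and the variable names are hypothetical, Python threads stand in for the threads of one block, a plain list stands in for shared memory, a `threading.Barrier` plays the role that `__syncthreads()` plays in CUDA, and a lock-protected update stands in for an atomic add. It assumes `len(data) == block_dim`.

```python
import threading


def run_block(data, block_dim):
    """Emulate one thread block: every thread runs the same kernel body
    (single program) on its own element (multiple data)."""
    shared = [0] * block_dim                 # stand-in for per-block shared memory
    out = [0] * block_dim
    total = [0]                              # accumulator updated atomically
    total_lock = threading.Lock()            # stand-in for a hardware atomic add
    barrier = threading.Barrier(block_dim)   # stand-in for barrier synchronization

    def kernel(thread_idx):
        # Phase 1: each thread stages its own element in shared memory.
        shared[thread_idx] = data[thread_idx]
        # Nobody may read a neighbour's slot until every write has landed.
        barrier.wait()
        # Phase 2: each thread reads the slot written by its right-hand neighbour.
        neighbour = shared[(thread_idx + 1) % block_dim]
        out[thread_idx] = data[thread_idx] + neighbour
        # All threads add into one location; the lock makes the update atomic.
        with total_lock:
            total[0] += out[thread_idx]

    threads = [threading.Thread(target=kernel, args=(i,)) for i in range(block_dim)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out, total[0]


if __name__ == "__main__":
    print(run_block([1, 2, 3, 4], 4))
```

Without the barrier, a thread could read its neighbour's slot before the neighbour has written it; without the lock, two threads could overwrite each other's update to `total`. Those are the two hazards that barrier synchronization and atomic operations exist to prevent.
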

Original post: http://www.cnblogs.com/sjtujoe/p/4117512.html
